Gresham College Lectures

Integral Transforms

April 21, 2022

Integral transforms are the most rarefied of analyses – only used by a subset of engineers and computer scientists; laboured over by many an undergraduate, usually with the accompanying moans; yet every computer, every electronic circuit is an incomprehensible jumble of wires without knowledge of integral transforms. 

The most common, the Fourier transform, is estimated to be the most computed algorithm in the world, but what of the Laplace and z-transforms?

This lecture will explain them without using daunting mathematics.


A lecture by Richard Harvey

The transcript and downloadable versions of the lecture are available from the Gresham College website:
https://www.gresham.ac.uk/lectures-and-events/integral-transforms

Gresham College has been giving free public lectures since 1597. This tradition continues today with all of our five or so public lectures a week being made available for free download from our website. There are currently over 2,000 lectures free to access or download from the website.

Website: http://www.gresham.ac.uk
Twitter: http://twitter.com/GreshamCollege
Facebook: https://www.facebook.com/greshamcollege
Instagram: http://www.instagram.com/greshamcollege


- Hello, integral transforms is the topic of today's lecture. And what can I say about integral transforms, other than that merely murmuring the words in an undergraduate class is usually sufficient to let rip the most appalling screams of anxiety. So, those of you who are here this evening, or those of you who are watching online, I congratulate you on your enterprise and bravery in putting up with integral transforms.

The reason I wanted to talk about them is that in this series we're talking about inventions and secret components of IT systems, and integral transforms are a vital key to the way modern IT works. But very few people really know how they work. So, what I want to do is take you through some of the mathematics and theory of integral transforms. I'm well aware that some of us, and maybe even me, will get lost on our journey through integral transforms. If that's the case, don't worry; just mop your brow gently and the next slide will come along and we'll pick it up. When watching a mathematics seminar, it's not always easy to focus. In fact, I remember speaking to a professor of maths and I said, "Tim, how much of that seminar did you understand?" He said, "Well, I never understand maths seminars. I don't really understand them at all afterwards." So, if you don't understand this seminar, then there's my excuse.

Right. So, those of you who remember your sort of high-school physics might remember this circuit, which is called a potential divider. I wanted a little example that gets us started, and the example I was going to think of was something called impulse testing. There's an old engineering joke which says that in engineering, if something moves and it shouldn't move, then you should use (indistinct), and if something doesn't move and it should move, you should use a hammer. And those are the only two tools you need for engineering. Well, the hammer is what we call an impulse. It's a blow, and it's used as a way of testing things.

And this box is known as a potential divider, and if the boxes contain resistors, as they do in this case, then you might remember from your school days a little formula: the ratio of the voltage out of this thing to the voltage into it is given by the ratio on the right-hand side there, which is the ratio of the two resistances. And a resistor, of course, is a device in which the voltage across it is proportional to the current going through it. Now, suppose I replaced one of those resistors with something that looked like this. That funny-looking thing there is a capacitor. Don't panic if you don't know what a capacitor is; this is not going to be a circuit theory lecture, so just bear with it. My point, really, is that describing this circuit is not as easy as just dividing some resistances. The capacitor has a response that depends upon the rate of change of voltage across it. So, the current flowing through it depends upon the rate of change of voltage, which we call "dv/dt" (it's just a bit of mathematical notation), multiplied by a number called "C", the capacitance. The bigger the capacitor, the higher the current that flows through it. So, now we've got this rate of change of something, or gradient, and our very simple bit of circuit analysis looks a bit more horrifying, really, because it's got a derivative, this rate of change. And this thing over on the right-hand side is called a differential equation.
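As a minimal sketch of the circuit just described (not from the lecture; the component values are made up), integrating the capacitor law i = C dv/dt for a step input reproduces the exponential charging curve that comes up on the next slide:

```python
import numpy as np

# Minimal sketch: numerically integrating the RC potential-divider equation
# for a step input. R and C here are made-up values, not from the lecture.
R, C = 1e3, 1e-6          # 1 kilohm, 1 microfarad -> time constant RC = 1 ms
dt = 1e-6                 # integration step, seconds
t = np.arange(0, 5e-3, dt)
v_in = np.ones_like(t)    # unit step (Heaviside) input
v_out = np.zeros_like(t)
for n in range(1, len(t)):
    # current through R equals current into C: (v_in - v_out)/R = C * dv_out/dt
    dv = (v_in[n - 1] - v_out[n - 1]) / (R * C) * dt
    v_out[n] = v_out[n - 1] + dv
# v_out now follows 1 - exp(-t/RC), the exponentially rising curve
```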
And I think it's fair to say that undergraduates, on the whole, are not at all keen on such things. It's probably also fair to say that professors, however, are very keen on them, and there are several reasons for that. Partly, it's because solving differential equations is something that can demonstrate your prowess at mathematics, and professors are very keen on doing that. And also, each solution of this depends upon the type of voltage that we're going to put into it. So, there's a sort of infinite number of lectures that can be given: we just pick the input voltage. If we picked a sine wave, that would have one solution, and if we picked a cosine wave, that would be a different solution. If we picked a step function, that'd be a different solution. If we picked an impulse, it'd be a different solution. And you can imagine, this is very popular in undergraduate lecture courses, because we essentially pick one of these inputs and then we spend between half an hour and 40 minutes showing people how to solve differential equations for those conditions. And this is good if you like torturing undergraduates, but it has a slight downside if we're trying to design things, because we're going to have to re-solve these differential equations each time we get a different input.

So if, for example, the input looked like this, this little pink line here, which is something called a step function. Looks a bit like that. It's the unit step, sometimes called the Heaviside step, named after Oliver Heaviside, the English mathematician. Then the output of this circuit looks like this blue line, which is exponentially increasing. One way of thinking about that, if you're a physics sort of person, is to think about a capacitor charging up. It depends on the input voltage, but it also depends on something called the time constant, which is how quickly the current can flow into the capacitor, and that's governed by how big the capacitor is and how big that resistor is. So, it's got this "e" to the "-t" form in there, with some additive and multiplicative constants.

Now, what would I do if I had a more complicated waveform in here? Well, as I indicated, I'm going to be pretty much lumbered, because I'm going to have to solve a whole new set of equations. So, one thing I might do is say, "Well, these exponentials seem to appear a lot in circuits. So, why don't I just solve for a generic exponential?" And this is what I've done here. This on top here is my differential equation. And here, I've just assumed a waveform, a voltage, of the form "e" to the power "st". So, that's an exponential. And the nice thing about "e" to the "st" (I'm afraid I've missed (indistinct) there, but I should've put it in) is that its derivative is just itself multiplied by "s". So, with a little bit of algebraic jiggery-pokery, I've converted this nasty-looking thing into a nice thing, which is easily solvable. And, in fact, with a little more algebraic manipulation, which there's no need to follow, I can get it almost into the form I had for my potential divider. Instead of a resistor here and here, as I had in the previous case, I've got something called "Z". And "Z" is something called an impedance, which is a frequency-varying resistance, and in this case it's specified in terms of this variable "s". So, I haven't said what it is, but "s" is that exponent in the exponential. So, something rather beautiful has happened.
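Here is a small symbolic sketch of that step (my own, assuming the standard impedances Z_R = R and Z_C = 1/(sC)): once we assume the waveform e^(st), the divider can be solved with school algebra rather than calculus:

```python
import sympy as sp

# Minimal sketch: with the kernel e^(st) assumed, the capacitor behaves like
# an impedance 1/(sC), and the potential divider becomes simple algebra.
s, R, C = sp.symbols('s R C', positive=True)
Z_R = R                      # resistor impedance
Z_C = 1 / (s * C)            # capacitor impedance under the e^(st) assumption
H = Z_C / (Z_R + Z_C)        # divider ratio Vout/Vin = Z2/(Z1 + Z2)
print(sp.simplify(H))        # -> 1/(C*R*s + 1), a simple polynomial ratio
```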
It might not be sort of obvious; it might look a bit ugly to you. But from my perspective, it looks rather attractive: I've managed to make an assumption about the waveform here, and when I make that assumption, my nasty differential equation becomes a simple little polynomial equation, which I can solve or manipulate using my high-school mathematics. So, a very, very attractive idea.

Now, you might protest at this point and say, "Well, hang on, Richard. 'e' to the 'st', that's all very well, solving for that. But what about the general case, when I've got all sorts of other waveforms I might want to solve for?" Ah, right. Well, I've got two things to say. The first is that there's a good trick in linear time-invariant systems, which is something called superposition. And superposition is a wonderful trick that we use a lot. The idea behind superposition is to think of the circuit we're analyzing as a sort of black box. So, in this case, I've got some signal coming in, which I've called "Vin", and I've got some signal coming out, which I've called "Vout". And I've said, "Well, the output voltage here, the signal, is some function of the input voltage." So, if my input signal was "x(t)" or something like that, then what would come out is some function of "x(t)". If it was "y(t)", then what would come out is some function of "y(t)". And superposition says that if my input signal can be written as some constant multiplied by "x(t)", plus some constant multiplied by "y(t)", then my output can be written as the sum of the component outputs. Now, that might not seem terribly useful, but it's super powerful, because it means we don't have to analyze big, complicated waveforms if we can break them down into components. So, if we could break this complicated input waveform into additive components, so it's a bit of this, plus a bit of this, plus a bit of this, then we only need to find the solution for each one of those components and then add the outputs back up. That's the first little bit of usefulness. So, if I could express a waveform as some sum of "e" to the "st", then that's great: I solved that a few slides back. All I need to do is add up the outputs.

Right. So, how could I convert a waveform into some sum of "e" to the "st", or "e" to the "-st", or something like that? Ah, well, why don't I measure how alike my waveform is to that waveform? So, how can I do that? Well, one way is to multiply my waveform by the thing I'm interested in measuring its alikeness to, and then average it. So, this is what I could do here. On the top here, that's me writing that in sort of simple maths. This is the waveform I'm interested in, and this is the thing I'm going to measure it against, "e" to the minus "st"; I've put a minus in there to stop this thing shooting off the slide. So, this is my "e" to the minus "st" waveform on the top here, and this is the thing I'm going to measure it against. Let's just multiply those two together. And then what I want to do down here is just average all of these, and in mathematics we normally write that like this. That big elongated "s" on the left-hand side, running from zero to infinity, stands for a summation, or integral. This "dt" here and the integral sign always go together, as a sort of pair of brackets. So, I've said "0.63"; I mean, I should have written it more clearly. I don't know quite what it is.
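A minimal numerical sketch of that multiply-and-average idea (my own; the waveform and the value of s are made up):

```python
import numpy as np

# Minimal sketch: measure how "alike" a waveform is to the kernel e^(-st)
# by multiplying the two together and averaging (integrating) over time.
s = 2.0                           # a made-up value of the kernel parameter
t = np.linspace(0, 10, 100_000)   # a finite stand-in for "zero to infinity"
f = np.exp(-t) * np.sin(3 * t)    # some example waveform f(t)
kernel = np.exp(-s * t)           # the prototype we compare against
dt = t[1] - t[0]
alikeness = np.sum(f * kernel) * dt   # multiply, then average over time
print(alikeness)                  # one number for this choice of s
```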
But the idea here is: multiply the two together, average over time, and that measures how alike I am to my prototype here. And this is usually called a kernel; not a colonel as in a military colonel, but a kernel with a "k".

Right. So, I can put those two things together now. And that gives me something that looks pretty horrifying, but I'm going to break it down for you: it's called the Laplace transform. And the Laplace transform is merely a measure of how alike some waveform, which I've called "f(t)", is to a kernel, "e" to the "-st". So, that is my time-domain function, the waveform; that's the thing that's probably the input to the circuit or system. This thing here is the kernel. And these two things here, that's the averaging. It's also worth thinking about what I get as the output. That output doesn't depend on time anymore. This waveform here depended on time; this one doesn't. It depends on "s", and "s" was a parameter of the kernel. So, at a sort of simple algebraic level, this is a bit of maths that takes something that depends on time on one side and produces something that depends on "s" on the other side. That's why we call it a transform: it moves from one domain, as we call it, the time domain, into, in this case, the "s" domain. And it's called the Laplace transform after Pierre-Simon Laplace. And if you're all sort of freaking out about integrals and worried about all these things, well, you might, because these integrals can be a bit tricky to invert, actually. We often use this curly symbol, a script "L": take a signal, put it through the Laplace transform, and out comes "F(s)". And although I haven't proved this, you can go back the other way, so we often use this double-headed arrow.

Now, something great has happened at this point. I've now got a magic bit of maths that can convert signals into the "s" domain. And I've sort of indicated that circuit analysis, by which I mean electrical circuits, but we could also be talking about mechanical circuits, can be done quite easily in terms of the "s" domain. And the output, well, if we want to get back to the time domain, and you might think, "Oh, of course we want to get back to the time domain," we could just apply the inverse transform. The inverse transform can be tricky to compute, but usually engineers use a sort of table lookup. They say, "Well, which one of those? Right, it must be one of those." Actually, a lot of engineering never bothers to convert back to the time domain. Engineers, particularly electronics and control engineers, usually do their thinking in the "s" domain, and they usually won't trouble themselves with explicitly working out the time domain. There's no need, because they can visualize in the "s" domain, in the transform domain.

So, that's the essence of an integral transform. It's a bit of mathematical jiggery-pokery that makes the analysis easier. And in that sense, it's probably fairly standard fare for a lot of engineering and physics courses. Physics courses usually don't really go for the Laplace transform; they're usually into a different set of transforms. Well, there's a whole book of these, and if there's some time at the end, we can try and think of some. But pretty much any problem that you can think of will have an appropriate transform. The Laplace transform is very much favoured by control engineers and electronics engineers, design specialists, you could say.
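A short sketch of the same transform done symbolically (my own example; sympy's laplace_transform performs the zero-to-infinity integral):

```python
import sympy as sp

# Minimal sketch: the Laplace transform as "alikeness to e^(-st)",
# computed symbolically for an example waveform.
t, s, a = sp.symbols('t s a', positive=True)
f = sp.exp(-a * t)                               # an exponential decay
F = sp.laplace_transform(f, t, s, noconds=True)  # integral of f(t) * e^(-st), t from 0 to infinity
print(F)                                         # -> 1/(a + s): time domain in, s domain out
```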
Nowadays, of course, we tend not to be designing a lot of analog circuits. We're often working in the digital domain. And in the digital domain, we tend not to have differential equations; we tend to have difference equations, which are just differences of things. Well, there's a transform for that, as there's a transform for everything, and that's called the z-transform. Bizarrely, the Laplace transform wasn't invented by Pierre Laplace. The Laplace transform was probably invented by Niels Abel, I think, the Norwegian mathematician. And Laplace probably invented the z-transform, in a bizarre sort of twist of fate. So, I don't want to say too much about the z-transform, but that's the method of choice for designing digital filters and for worrying about the stability of systems.
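A minimal sketch of that digital-domain counterpart (my own; the coefficient is made up): a first-order difference equation standing in for the differential equation. Its z-transform would give the transfer function H(z) = 1/(1 - a z^(-1)):

```python
# Minimal sketch: a difference equation y[n] = a*y[n-1] + x[n] replaces the
# differential equation of the analog circuit; |a| < 1 keeps it stable.
a = 0.9                       # a made-up filter coefficient
x = [1.0] + [0.0] * 19        # a discrete impulse as input
y, prev = [], 0.0
for sample in x:
    prev = a * prev + sample  # the difference equation itself
    y.append(prev)
print(y[:5])                  # -> 1.0, 0.9, 0.81, ... a sampled exponential decay
```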
Now, I should say at this point that we've been talking about these mathematical tools the way a mathematician would: as a way of making problems easier to solve, and you might not feel they're easier to solve, but once you get the hang of these things they're fairly manageable and manipulable, or as a way of understanding, a way of analysis. But that's not the whole story. Computer science has some of that, but I would also like to talk about these as computational devices. And that means switching across to the king of transforms, which is the Fourier transform. I haven't written down the integral form of the Fourier transform, because it's got a complex number in it, and I felt that was probably a step too far amongst civilized company, and it's not necessary. Because once you've got the idea: here, instead of "e" to the "st", we have something that depends upon omega, which is the angular frequency, or if you prefer, "f"; "f" is the non-angular frequency, the frequency we've been referring to. "f" is measured in hertz and omega is measured in radians per second, and they're coupled by 2 pi: omega equals two times pi times "f".

Now, the Fourier transform, did I call it the emperor of transforms? It's the transform that almost everyone knows about, but probably doesn't know about. People are familiar with the output from the Fourier transform, but they're probably less familiar with its computation. So, let's have a look at a few examples and try to get a handle on how it works. Probably the simplest thing I could think of was to refer back to the start of the lecture, thinking about impulse testing. So, that's an impulse: it's where we hit something with a hammer, or in terms of a voltage, where we just ping it with a very short pulse. Here, I've used a formulation which is a theoretical one, where instead of drawing a little pulse, I've drawn this little arrow here. And this is an example of an idealized pulse. It doesn't really exist, but it is mathematically convenient. The little arrow is meant to represent a pulse that's infinitely thin and infinitely high. So, it has an area of one, but it's as thin as you can make it and as tall as you can make it. These are often called generalized functions, and this is an example of a generalized function, also called an impulse. It's also sometimes called a Dirac delta function, named after Paul Dirac, the Bristolian engineer turned theoretical physicist. And over on the right-hand side, I've plotted its Fourier transform. Some people would call it the spectrum. So, let's just have a look at it. I've got this highly localized thing in the time domain. Beep, click.

And it contains all frequencies, over here. So, that's one of the reasons that impulse testing is often spoken about, particularly in mechanical engineering, because if I tap something with a hammer, it excites all frequencies in the system. In the old days of steam trains, there was somebody who went around with a hammer, tapping the wheels of steam trains, the idea being that that provided an input containing all frequencies. So, if there was a defect in the wheel, there would be enough spectral energy in the impulse to excite that defect, and a skilled operator could listen to the ringing associated with the wheel and know that there was a defect. Such people are known as wheel-tappers.

Let's have a look at something that's a bit more realistic. So, here is the Fourier transform of a pulse, and this is a rectangular pulse, centered around time zero. It's zero, goes up and comes down again, and this is its Fourier transform. Now, you might be slightly irritated to see frequency both positive and negative. This is an artifact of the way most people define the Fourier transform, and for all practical purposes you can just focus on the positive-frequency part here; that's what you would see if you had a spectrum analyzer in front of you, only the positive frequencies. It's a mathematical artifact having these, and in fact it's not necessary; you could define a Fourier transform that doesn't have them, but convention has it that we do. And you can see here, we've got something called a "sin x over x" function, here a "sin f over f" function. We've got quite a bit of low-frequency energy here, which is associated with this being lifted off the zero line, and then we've got these sidelobes down here. So, at high frequency there's not much energy, and the energy there is, is associated with these rapid transitions here. If I'd picked a slightly different shaped pulse, say a longer pulse, you might expect it to have more energy focused at low frequencies, which it does. If I picked a shorter pulse, you might expect it to have more energy at the higher frequencies. So, can you see, if I was to make that pulse thinner and thinner and thinner and taller and taller and taller, this would become wider and wider and wider, until eventually it would be flat, which was the situation we looked at in the previous slide. In fact, that is usually how the spectrum of the impulse is derived. So, in reality, your wheel-tapper doesn't manage to excite all frequencies, because he or she is only able to tap the wheel for a finite amount of time with a finite amount of energy, and so we have this shaping here, to the spectrum.

Now, to engineers and analysts, you can work in this domain or you can work in this domain; you can work in the time domain or the frequency domain, and everything is (indistinct). If I picked something that was well matched to my Fourier transform, like a cosine or sine wave, and the Fourier transform can be written in terms of cosines and sines, then I get two generalized functions in the frequency domain. So, this is a cosine wave of frequency "f0", and it has a spike, infinite height, zero width, at "f0". So, the Fourier transform is particularly well suited to oscillating waveforms, like cosines and sines. And that's one of the reasons it's popular, because a lot of things tend to oscillate, to go up and down, in nature. Right, well, so far, so good.
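A small numerical sketch of the pulse-width trade just described (my own): the FFT of a rectangular pulse has the sin(x)/x shape, and halving the width pushes the first zero of the main lobe out by a factor of two:

```python
import numpy as np

# Minimal sketch: a narrower pulse in time spreads its energy to higher
# frequencies (the constant time-bandwidth product described above).
N = 1024
for width in (128, 64):                   # a longer and a shorter pulse
    pulse = np.zeros(N)
    pulse[:width] = 1.0
    spectrum = np.abs(np.fft.fft(pulse))
    # the sin(x)/x main lobe reaches its first zero at bin N/width: 8 then 16
    first_zero = np.argmax(spectrum < 1e-9)
    print(width, first_zero)
```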
I still haven't really got past using the Fourier transform as a tool for understanding and analysis. And in fact, you might be feeling your understanding is slightly worse than it was at the beginning of this, because of the mathematical complexity associated with transforms. I wanted to do just one more little diversion, and then we'll get on to it as an algorithm. And that relates to randomness. Of course, in reality, signals are most likely to be random, and up to this point I've been talking about signals that are defined by simple equations. But reality isn't like that. I might have some waveform like this, which is actually me saying the word "aggression", I think, if I've remembered rightly. So, how could I handle that? Well, the Fourier transform can certainly be used, but it isn't used directly on the time-domain signal. It's used on a measure of how alike the time-domain signal is with itself. We call that autocorrelation, and the way we work it out is to use our trick of multiplying and averaging again, except we multiply the signal by itself. So, we shift the signal by, in this case, an offset called tau. We multiply the two together, and that particular shift here is associated with one value here, at a particular tau. Then we just shift this thing across here, and as tau varies, we create this thing here, where this is not time along here, this is delay, sometimes called lag. And if we take the Fourier transform of the autocorrelation, we now have something that we can work with, and that is usually known as the power spectrum. So, not quite the same as the spectrum that we've been dealing with, obviously; it has different units, for one thing, but it has the same sort of intuitive idea, which is that these features here relate to components of interest in the signal. And when people are talking about a spectrum, well, talk is often quite loose on what's meant by these things, but that's usually what they're talking about: something that is technically the Fourier transform of the autocorrelation function.

Now, in reality, computing the autocorrelation is a bit of a pain. There are some technical reasons for that, to do with convergence, but there's also a practical reason, which is that it looks like a sort of batch process, doesn't it? You have to sit there, measure the signal, shift it around against a copy of itself, multiply the two together, do all those integrals and add it all up. So, in reality, you might not want to do that. There are various approximations in which you split the waveform into little chunks, you apply the Fourier transform to those little chunks, and then you do some averaging of all those chunks. And that can also be shown to produce the power spectrum. In fact, there are whole theses written on methods of estimating this statistic from the original time waveform, without having to go through the tedium of computing the autocorrelation to get it.
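A minimal sketch of that chunk-and-average approximation (my own; the signal is a made-up tone buried in noise):

```python
import numpy as np

# Minimal sketch: estimate a power spectrum by chopping a noisy signal into
# chunks, taking the FFT of each chunk, and averaging the squared magnitudes,
# rather than computing the autocorrelation first.
rng = np.random.default_rng(0)
n = 16 * 512
t = np.arange(n)
signal = np.sin(2 * np.pi * 0.05 * t) + rng.normal(0, 1, n)  # tone + noise
chunks = signal.reshape(16, 512)          # 16 chunks of 512 samples each
power = np.mean(np.abs(np.fft.fft(chunks, axis=1)) ** 2, axis=0)
peak_bin = int(np.argmax(power[:256]))    # positive frequencies only
print(peak_bin)                           # -> about 26, i.e. 0.05 * 512
```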
Right. It looks like a bit of a pain to compute this. We have to do this thing called an integral, and we have to do it for every possible value of frequency. So, if you think back to that equation for the Laplace transform, if you imagine writing a program to compute it, you'd think, "Alright, so, I've got to pick a particular 's', right? Get my 'e' to the 'st', right? Draw that out, multiply it by the time waveform, do the integral, average, average, average, average, average, right. That's one value, right. New 's'. Stretch it." Looks very tiresome, indeed.

So, there's an approximation to that Fourier transform that we use, which is the Fourier transform applied to signals that are already discrete. Discrete here meaning separate: C-R-E-T-E, that sort of discrete, not E-E-T. And this is it. It's the discrete Fourier transform. And you might think, "Gosh, this doesn't look very much like what we had previously." Actually, it has an almost identical form, with a few constants knocking around, different in each case. So, the first part of this is the signal itself, and that's there. I called it a time-domain signal, and you might say, "Well, where is time?" Well, because it's sampled, because it's discrete, I'm using "n" to index it, rather than "t". But it's the same thing; it's the sampled thing. And I'm multiplying it by something, the thing I want to measure how alike it is to, and I'm calling that the kernel, just as I did before. And because it's a discrete signal now, I'm not going to do an integral, I'm just going to do a summation. So, that's that bit there. And once I've done that average, well, I called it an average; you might moan that I should be dividing by "n" for it to be an average. Let's not worry about "divide by n", "divide by t"; it doesn't really matter, I should say. And then, over on the left-hand side, I've got my transform. The thing about the transform is that it doesn't depend on "n"; "n" has been summed away, if you like. It only depends on "m", and "m" is the equivalent of frequency here.

Now, this is the masterwork of modern computer science: the discrete Fourier transform, the discrete version of the Fourier transform. I hesitated a bit over the "divide by n" and all those things. It's probably worth saying that in old books there are quite a few different definitions of Fourier transforms, and they differ a little bit in some of the symbols and some of the normalizing constants. There are some factors of 2 pi that skulk around the place, and some people want to divide by root 2 pi, so that the inverse transform and the forward transform look the same. It doesn't really matter. Nowadays, there is general agreement on where these constants should sit, and those are the definitions I'm using here.
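A minimal sketch of that definition exactly as written (my own), checked against a library FFT:

```python
import numpy as np

# Minimal sketch: the discrete Fourier transform written directly as the sum
# in the definition: multiply the sampled signal by the kernel and sum.
def dft(x):
    N = len(x)
    X = np.zeros(N, dtype=complex)
    for m in range(N):            # one output value per "frequency" index m
        for n in range(N):        # multiply by the kernel, then sum ("average")
            X[m] += x[n] * np.exp(-2j * np.pi * m * n / N)
    return X

x = np.cos(2 * np.pi * 4 * np.arange(32) / 32)   # 4 whole cycles in 32 samples
print(np.allclose(dft(x), np.fft.fft(x)))        # -> True
```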
Right. So, this thing is the algorithm that dominates modern computing, and I'll try to explain why that is in a few moments' time. But let's spend a short while trying to understand what it does. So, I apply the discrete Fourier transform, or "DFT" as it's almost always called, to a discrete version of a cosine wave, which is what I've drawn here. You can see it's a cosine wave, because it starts up here at its maximum value, then it goes up and down, up and down, and it ends just here. And I've chosen a cosine wave that fits absolutely perfectly into the number of samples I have in my little graph here, which happens to be 1024. And you can see the output. It looks a bit weird, but I'll explain it in a moment. Firstly, there is a spike here, and that is at the frequency of this cosine wave. This spike here looks a bit odd. The way this thing is formulated, these are all our positive frequencies, up to around 512, and then these have to be wrapped around over here; these are the negative frequencies that we were looking at earlier, which I said not to worry about.

So, just for the time being, consider this bottom part of the graph. That looks, well, pretty fab and groovy, really. I mean, you've got to compute it, but it gives a reasonable approximation to what we're expecting. It's not a Dirac delta, (indistinct) with an infinite spike, because we have a finite amount of data here, so we don't get an infinite spike, we get a finite spike. But it looks a bit like what we'd expect. But because it's a discrete algorithm, there are some nasties, which we always seem to have to take care of. Here's one. If I just change the frequency of this waveform a little bit, what you might expect is that this spike would move this way or that, left or right, depending on the frequency. And in fact, that does happen. But what also happens, if I do that, is that some artifacts appear. I'm not sure if you can see them on the slides; I've just sketched them in here. They're called leakage, spectral leakage. And the spectral leakage is caused by the fact that this waveform here, at the end, doesn't exactly line up with the start. I didn't make this evident, but it's an artifact of the discrete Fourier transform: it makes an implicit assumption that the waveform is repeated outside the interval, exactly. So, when we choose a frequency that doesn't exactly fit into the interval, we create what look like big jumps. These cause spectral energy, which causes the imperfections, the spectral leakage.

Well, there are whole books written on ways to avoid spectral leakage, but the standard way is to force the issue. And the way you force the issue is by multiplying the input waveform with something called a window, okay? So, this is one that forces the ends to zero. If you force the ends to zero, obviously this thing lines up and the spectral leakage disappears. It does have a consequence, which is that as we squeeze those ends down to zero, we broaden the peaks. Perhaps I should've said this when I was talking about that rectangular pulse. I don't know if you noticed, but as the rectangular pulse changed width, the width of the "sin x over x" function also changed, in exactly the opposite ratio. That's something called the time-bandwidth product, which is always constant. And if you are a physicist, you would say, "Ah, yes. Well, I know that position and momentum are Fourier transform pairs, and I know, due to the Heisenberg uncertainty principle, that the uncertainty in position times the uncertainty in momentum is always a constant." In other words, you can know precisely where you are, but not how quickly you're going; or you can know precisely how quickly you're going, but not precisely where you are. Always a good excuse if you're stopped by a policeman on the motorway and you're a physicist. And the same thing applies here: you can either know precisely where you are in time, but not know the frequency very precisely, or you can know the frequency precisely, but not know where you are. And that's a consequence of the finite window of analysis, which is what we've seen in this case. So, there's a whole sort of sub-genre associated with using the discrete Fourier transform in an elegant way.
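A small sketch of leakage and windowing (my own; the counting threshold is arbitrary): an off-bin cosine smears energy across many bins, and a Hann window, which forces the ends to zero, concentrates it again:

```python
import numpy as np

# Minimal sketch: a cosine that doesn't fit a whole number of cycles into the
# block leaks energy across the spectrum; windowing suppresses the leakage
# at the price of a slightly broader peak.
N = 1024
t = np.arange(N)
x = np.cos(2 * np.pi * 100.5 * t / N)        # 100.5 cycles: doesn't fit exactly
leaky = np.abs(np.fft.fft(x))
windowed = np.abs(np.fft.fft(x * np.hanning(N)))
# count bins holding noticeable energy, as a crude measure of leakage
print(np.sum(leaky > 1), np.sum(windowed > 1))   # many bins vs only a few
```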
Now, I should say that you've probably heard that the term "DFT", discrete Fourier transform, isn't actually used that much nowadays. Almost everyone would use the term "FFT", which stands for fast Fourier transform. It always used to stand for one algorithm, known as the Cooley-Tukey radix-2 algorithm, but there is now a whole library of fast Fourier transform algorithms. So, when people say "the FFT", they could mean any algorithm for computing it quickly. And I may have used the terms interchangeably in this lecture, and if I have, I think that's now acceptable. If we were in an undergraduate lecture, they'd be saying, "No, they're completely different." But in common parlance, nobody computes a "DFT" the slow way. Everybody computes it using one of the numerous fast algorithms out there. The fast algorithms arise because the kernel function in a Fourier transform rotates, and because it rotates, we can find symmetry in the algorithm. So, we can split the multiplications associated with the kernel function into multiple smaller sets of multiplications, and that gives us computational efficiency.

Right. Let's look at some examples of how it's applied, because I want to dwell a bit on how the "FFT", the "DFT", is used. The first example I've picked is something I've been longing to talk about for quite a few lectures. I realized I should've talked about it in my compression lecture, but there just wasn't enough time, so I've sneaked a bit of compression into this lecture, partly to fill a hole that I felt needed to be filled. This graph, which is quite an old graph now, describes the response of the human ear, and we're describing an effect here known as masking. Masking is a property that arises from the way the human cochlea works; the cochlea, in your ear, is the human Fourier transformer, that's how it works. And the human ear is incapable of hearing certain things when certain signals are present. So, for example, if a signal falls below this smooth threshold here, labeled "threshold in quiet", then we probably won't hear it at all. If you can't read this, on the bottom here we have a frequency axis, measured in kilohertz, and up here we have sound pressure level, measured with respect to some reference; I can't remember what the reference is. There's also, I don't know if you can see this, a second curve here, and this threshold arises from the presence of a tone. So, if we had a tone, which I've sketched here as a Dirac delta symbol, if we had a tone here, of this intensity, then any signal that falls within this is not going to be distinguishable by the human ear. So, if you think about the signals that you're used to listening to, if there is a substantial tonal component of, say, a music signal, let's say you're listening to someone playing the violin, so maybe the fundamental or the second harmonic of the violin dominates what you're listening to, then there are other bits of the musical signal that you can't hear. So, there's clearly some potential there for not sending those across the airwaves, or onto the disk, at all. And that's the basis of what's called lossy compression. Lossy compression is when we're prepared to completely eliminate parts of the signal, because we think the human at the other end won't miss them. So, several important points there. It's the human on the other end. If your dog is listening, I'm not so confident, because we don't know how dogs and cats and non-humans hear; well, this is a human-only system.
So, if you are listening to or sending compressed digital audio, and you almost certainly are, certainly during the pandemic, when most of us were on "Zoom" or "Teams" virtually every hour of the waking day, and several hours of the non-waking day if you were working with other countries, you were doing this every single moment. And what you were doing was following something like the MP3 coding standard. This is a diagram of it. There are different audio coding standards, but the MP3 one is sort of canonical, and the basic idea is repeated again and again. What I'd like to draw your attention to is that we've got this digital audio coming in here, and the first thing that happens over here is a fast Fourier transform algorithm. Well, we've just talked about that. Then we're running this model, and what the model is doing is looking for these spikes and trying to work out what the masking threshold is. Having worked out the masking threshold, it's then going to use more or fewer bits to code different parts of that signal. To do that, it has to work in the frequency domain, not the time domain. So, this part of the algorithm, up here, top-left, splits the input signal into sub-bands, alright? Then it transforms them, usually using a modified discrete cosine transform, which is a type of Fourier transform. It then applies a different number of bits to each one of these bands, does some standard run-length encoding and compression, and slams it down the audio stream. So, if you're listening to digital television or digital radio, or you're broadcasting via "Zoom" or "Teams", or whatever you're using, you're doing something like this. And the rate at which you're doing it is fairly amazing: if it's high-quality audio, 44.1 or 48 kilosamples per second. So, you're doing a lot of it.

Now, this is one of the algorithms that fall, I think, into the discrete Fourier transform or "FFT" hall of fame, and I'll talk a little bit about some of these now. But before I do that, I thought I'd draw your attention to this quote from Gilbert Strang. Strang is a bit of a hero of undergraduates for his explanations of matrix maths and linear algebra, and he has described this as the most important algorithm of our lifetime. Let's just talk about some of these briefly. So, audio coding we've talked about. The idea there is that the "FFT" is certainly used for computing the psychoacoustic masking. It may, in some versions, also be used for the sub-band filtering, filtering the bands and applying them. Subsequent to that, you're going to apply different numbers of bits. Not all standards do it that way: some of them use filters, some use the "DFT". Video coding I've talked about in previous lectures. But if you're watching this online, the video that you're watching has been split into eight-by-eight pixel blocks. Those blocks have been transformed using a version of the discrete Fourier transform, the discrete cosine transform, which is suited to the all-positive values that you get in images. Then some of those coefficients are sent, but others are discarded, because we don't think the human eye can see them. And in previous lectures, I've certainly looked at how the "DCT" does that. So, on your "Zoom" call, not only was the audio coded using the "DFT", the video was coded using the "DCT", which is a type of Fourier transform.
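As a toy sketch only (my own, and far cruder than MP3's psychoacoustic model): transforming a block, discarding coefficients below a made-up audibility threshold, and transforming back is the skeleton of the lossy idea:

```python
import numpy as np

# Toy sketch of lossy coding: keep only the frequency components a listener
# is assumed to notice. The threshold here is hypothetical, not MP3's model.
rng = np.random.default_rng(1)
block = np.sin(2 * np.pi * 50 * np.arange(512) / 512) + 0.01 * rng.normal(size=512)
coeffs = np.fft.rfft(block)                 # to the frequency domain
threshold = 0.05 * np.max(np.abs(coeffs))   # made-up "audibility" threshold
kept = np.where(np.abs(coeffs) > threshold, coeffs, 0)  # discard the rest
print(np.count_nonzero(kept), "of", len(coeffs), "coefficients kept")
decoded = np.fft.irfft(kept)                # back to the time domain
```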
Speech recognition. There are various ways of doing speech recognition, but one of the standard ways, and I think I discussed this in a lecture a few years ago, is to use "Mel-frequency cepstral coefficients". "MFCCs", as they're called, are a measure of how the spectrum of the audio changes, or what the shape of the spectrum is. And the way they measure that shape is rather interesting. They use, first of all, the discrete Fourier transform to compute estimates of the spectrum in blocks. And then they measure the spectrum of the spectrum. So, they apply the Fourier transform to the spectrum to work out its shape, and it's the spectrum of the spectrum, or cepstrum as it is often called, that characterizes the voice. Cepstrum is a little joke, where you interchange some of the letters to indicate that it's slightly different from the spectrum. One interesting thing about speech recognition, of course, is that it has to happen continuously. It's no good waiting a goodly amount of time and computing everything; we have to do it now, otherwise there's too much delay in the system. So, the speech engineers divide the speech into little blocks, usually 20 milliseconds. They compute the "DFT" of a block, and then they overlap: they wait 10 milliseconds before computing another block, and another block, and another block. So, there's always this overlap. They use windows that exactly overlap, so that, in principle, you could add the blocks back together and get the original waveform. This overlap idea allows them to do either recognition or speech coding. The overlapped transform that is particularly tuned to being able to add the blocks back together in an easy way is known as the modified "DCT", or "MDCT", which again is a form of discrete Fourier transform.

Digital TV. Well, we've talked about digital TV in the coding, but the "FFT" is also used in the transmission. Because digital TV uses, as does cellular telephone transmission nowadays, 5G, something called "OFDM", "Orthogonal Frequency-Division Multiplexing", a modulation scheme. And the way that works is they say, "Well, it'd be very convenient if the frequency band we had available wasn't completely used by one person. So, we want to split it into very precise little sub-bands, and we want, say, (indistinct)'s phone to go down this little band, and Keith's phone to go down here, and Vierra's phone to go down this one. Right, but we want to be able to do this adaptively, because sometimes we're very busy and we want to restrict people's access, and sometimes it's not so busy and we don't mind. Is there a way of doing this?" Well, what they do is they say, "Right, we'd like to put them onto little carrier waveforms, and we'll multiply those carrier waveforms by the signal from, say, (indistinct). How can I do that, when I'm not quite sure what the carriers are, and they're going to have to vary, and when I'm overseas it's different? I know what I'll do: I'll use an inverse 'DFT' to create all of these. So, I'll multiply the signal by a sine wave, and then I'll put them into an inverse 'DFT' and create a time waveform that is essentially the sum of all these little channels." So, your mobile telephone is doing a discrete Fourier transform as often as it can.
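A minimal sketch of that inverse-DFT trick (my own; the symbols are made-up data): one inverse DFT places every user's symbols onto their own sub-band carriers, and the receiver's forward DFT separates them again:

```python
import numpy as np

# Minimal sketch of the OFDM idea: each sub-band carries one data symbol;
# a single inverse DFT turns them all into one time-domain waveform.
rng = np.random.default_rng(3)
n_subbands = 64
symbols = rng.choice([-1.0, 1.0], size=n_subbands)  # made-up data, one per sub-band
waveform = np.fft.ifft(symbols)                     # transmit this time waveform
received = np.fft.fft(waveform).real                # the receiver's DFT separates the sub-bands
print(np.allclose(received, symbols))               # -> True
```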
In fact, one of the reasons your mobile phone gets hot, and one of the reasons your digital TV gets hot, is that there's a chip in there, just hammering out the discrete Fourier transform, without halt. Okay, I haven't really got time to talk about these other applications, but there's an awful lot of them. But I just wanted to mention one, because once you get to a certain venerable age, you start to worry about various things, including your health. And health is always popular, so I thought I'd quickly talk about the connection with one of these things. So, this is, I think I was going to say, a CT scan, or it might be an "MRI"; it doesn't really matter. The idea behind these things is that on one side of your head or your body is something that transmits something, maybe some ionizing or non-ionizing radiation, and on the other side we have a receiver, and they rotate around. You can hear them rotating as you sit in these things. So, over on the right-hand side here is the basic idea. We've got something that's rotating, and what we measure is the total transmission through the body. You can't really do very much with this directly. This thing is usually called a sinogram, and the sinogram is actually another form of integral transform: it's called the Radon transform. Of course, you've got this sinogram, and what you would like is this lovely image; you'd like to see into your brain, or see into your pancreas, or whatever it is you're imaging. And so, you have to solve this inverse transform problem. I haven't talked much about inverse transforms, and that's because the mathematics is a bit more complicated. But in order to solve the inverse Radon transform problem, you usually use the discrete Fourier transform. Yes, it even appears in medicine. So, you take the Fourier transform of the sinogram, multiply it by wavenumber, which is the spatial equivalent of frequency, take the inverse "FFT", and that's how you get a picture of your brain. So, the next time you're lying in one of these things, feeling a bit scared for your life and wondering what the hell is going on, you can engage the radiologist in entertaining repartee about which Fourier transform algorithm they're using, and the window that they used in order to get it looking good, or not, as the case may be.

Now, I can see time is pressing upon us. I do want to say one thing, though, which is that the Fourier transform is a rather interesting thing. It's well understood by physicists and engineers; it's part of their syllabus. It isn't part of the syllabus for computer scientists and programmers and software engineers. They may know it, they may know something of it, but for a lot of people the understanding is probably little more than a black box: it's a thing you put numbers into, and out come some other numbers. And that's a bit of a concern to me, because it's far more than that. In fact, there are simple tests you can do to see that it works, and we've talked about some of them here. If we put in, say, a spike, then we should expect to see all the numbers come out equal. If we put in a cosine wave, we should expect to see spikes come out, and so on. And it is quite easy to be intuitive about how these things are working and whether they're working or not.
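Those two sanity checks are easy to run; a minimal sketch (my own):

```python
import numpy as np

# Minimal sketch of the black-box sanity checks suggested above:
# a spike in should give equal numbers out; a cosine in should give spikes out.
N = 64
spike = np.zeros(N)
spike[0] = 1.0
print(np.allclose(np.abs(np.fft.fft(spike)), 1.0))   # -> True: flat spectrum

cosine = np.cos(2 * np.pi * 5 * np.arange(N) / N)    # 5 whole cycles
spectrum = np.abs(np.fft.fft(cosine))
print(np.flatnonzero(spectrum > 1e-9))               # -> [5 59]: two spikes, positive and negative frequency
```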
And that's an increasing issue with modern computing that bothers me a bit: there's a sort of black-box approach to design and understanding, which means it's difficult to be confident that the system you're using is, in fact, doing what it's meant to be doing. And the Fourier transform is probably the lowest-level box I could think of that isn't understood by most people. But there are others. And the others you could sort of wrap up into that interface between the real world and the thing that you're doing to the real world. That interface with the real world, in a computer system, is usually called an operating system. And so, that is the topic of the next lecture, which will, in fact, be my final lecture. Thank you. (audience applauds)

- Thanks so much, Professor Harvey. We've got a couple of questions from the online audience, which I was going to start with, and then we can move on to the audience in the room. The first question is, "Why is the length of the FFT always a power of two?"

- It's not. So, it always used to be, because there was only one algorithm for computing the "FFT" quickly, and it was called the Cooley-Tukey algorithm, and it worked on signals whose lengths were powers of two. Nowadays, there are plenty of other choices, prime-number transforms and so on, and there are whole books of these choices. So, it doesn't need to be a power of two; it's pure mystique that it is. And perhaps what the question really means is, "Why the hell did you pick a power of two to demonstrate it?" And I don't know the answer to that. I should have picked something that wasn't a power of two.

- Second question from online: "What is a spectrogram?"

- Ah, spectrogram. So, the (indistinct) illustrated this lecture with a spectrogram, actually. A spectrogram is a set of "DFTs", taken one after another and displayed using pseudo-colors. They're very popular with speech engineers. A friend of mine can look at a spectrogram and know what was said, which is a wonderful trick. If you'd like to know more about spectrograms, there's an online lecture from me on speech recognition called "How to recognize speech?"

- Thanks so much, Professor Harvey, and please do come to the next lecture on the 31st of May, 6 o'clock, on operating systems. Thank you so much.

- Thanks. (audience applauds)
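A minimal sketch (not part of the lecture) of the radix-2 idea behind that power-of-two answer: a DFT whose length is a power of two splits recursively into two half-length DFTs of the even- and odd-indexed samples:

```python
import numpy as np

# Minimal sketch of the Cooley-Tukey radix-2 decimation-in-time idea.
# The power-of-two restriction comes from halving the length at each step.
def fft_radix2(x):
    N = len(x)                       # must be a power of two
    if N == 1:
        return np.asarray(x, dtype=complex)
    even = fft_radix2(x[0::2])       # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])        # DFT of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(2).normal(size=64)
print(np.allclose(fft_radix2(x), np.fft.fft(x)))   # -> True
```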