ImageMagick v6 Examples --
Fourier Transforms

Index
ImageMagick Examples Preface and Index
Introduction
The Fourier Transform
FFT/IFT In ImageMagick
Properties Of The Fourier Transform
Practical Applications
Advanced Applications


Introduction

One of the hardest concepts to comprehend in image processing is Fourier Transforms. There are two reasons for this. First, it is mathematically advanced and second, the resulting images, which do not resemble the original image, are hard to interpret.

Nevertheless, utilizing Fourier Transforms can provide new ways to do familiar processing such as enhancing brightness and contrast, blurring, sharpening and noise removal. But it can also provide new capabilities that one cannot do in the normal image domain. These include deconvolution (also known as deblurring) of typical camera distortions such as motion blur and lens defocus and image matching using normalized cross correlation.

It is the goal of this page to try to explain the background and simplified mathematics of the Fourier Transform and to give examples of the processing that one can do by using the Fourier Transform.

If you find this too much, you can skip it and simply focus on the properties and examples, starting with FFT/IFT In ImageMagick

For those interested, another nice simple discussion, including analogies to optics, can be found at An Intuitive Explanation of Fourier Theory.

The lecture notes from Vanderbilt University School Of Engineering are also very informative for the more mathematically inclined: 1 & 2 Dimensional Fourier Transforms and Frequency Filtering.

Other mathematical references include Wikipedia pages on Fourier Transform, Discrete Fourier Transform and Fast Fourier Transform as well as Complex Numbers.

My thanks to Sean Burke for his coding of the original demo and to ImageMagick's creator for integrating it into ImageMagick. Both were heroic efforts.


The Fourier Transform

An image normally consists of an array of 'pixels' each of which are defined by a set of values: red, green, blue and sometimes transparency as well. But for our purposes here we will ignore transparency. Thus each of the red, green and blue 'channels' contain a set of 'intensity' or 'grayscale' values.

This is known as a raster image 'in the spatial domain'. This is just a fancy way of saying, the image is defined by the 'intensity values' it has at each 'location' or 'position in space'.

But an image can also be represented in another way, known as the image's 'frequency domain'. In this domain, each image channel is represented in terms of sinusoidal waves.

In such a 'frequency domain', each channel has 'amplitude' values that are stored in locations based not on X,Y 'spatial' coordinates, but on X,Y 'frequencies'. Since this is a digital representation, the frequencies are multiples of a 'smallest' or unit frequency and the pixel coordinates represent the indices or integer multiples of this unit frequency.

This follows from the principal that "any well-behaved function can be represented by a superposition (combination or sum) of sinusoidal waves". In other words, the 'frequency domain' representation is just another way to store and reproduce the 'spatial domain' image.

But how can an image be represented as a 'wave'?

Images are Waves

Well if we take a single row or column of pixel from any image, and graph it (generated using "gnuplot" using the script "im_profile"), you will find that it looks rather like a wave.

  convert holocaust_tn.gif -colorspace gray miff:- |\
    im_profile -s - image_profile.gif
[IM Output] ==> [IM Output]

If the fluctuations were more regular in spacing and amplitude, you would get something more like a wave pattern, such as...

  convert -size 20x150 gradient: -rotate 90 \
          -function sinusoid 3.5,0,.4   wave.gif
  im_profile -s wave.gif wave_profile.gif
[IM Output] ==> [IM Output]

However while this regular wave pattern is vaguely similar to the image profile shown above, it is too regular.

However if you were to add more waves together you can make a pattern that is even closer to the one from the image.

  convert -size 1x150 gradient: -rotate 90 \
          -function sinusoid 3.5,0,.25,.25     wave_1.png
  convert -size 1x150 gradient: -rotate 90 \
          -function sinusoid 1.5,-90,.13,.15   wave_2.png
  convert -size 1x150 gradient: -rotate 90 \
          -function sinusoid 0.6,-90,.07,.1    wave_3.png

  convert wave_1.png wave_2.png wave_3.png \
          -background black -compose plus -flatten  added_waves.png
[IM Output]  + [IM Output]  + [IM Output] ==> [IM Output]

See also Adding Biased Gradients for a alternative example to the above.

This 'wave superposition' (addition of waves) is much closer, but still does not exactly match the image pattern. However, you can continue in this manner, adding more waves and adjusting them, so the resulting composite wave gets closer and closer to the actual profile of the original image. Eventually by adding enough waves you can exactly reproduce the original profile of the image.

This was the discovery made by the mathematician Joseph Fourier. A modern interpretation of which states that "any well-behaved function can be represented by a superposition of sinusoidal waves".

In other words by adding together a sufficient number of sine waves of just the right frequency and amplitude, you can reproduce any fluctuating pattern. Therefore, the 'frequency domain' representation is just another way to store and reproduce the 'spatial domain' image.

The 'Fourier Transform' is then the process of working out what 'waves' comprise an image, just as was done in the above example.

2 Dimensional Waves in Images

The above shows one example of how you can approximate the profile of a single row of an image with multiple sine waves. However images are 2 dimensional, and as such the waves used to represent an image in the 'frequency domain' also needs to be two dimensional.

Here is an example of one such 2 dimensional wave.

The wave has a number of components to it.

Image example


FFT/IFT In ImageMagick

 
To be re-written, based on the section above.

In ImageMagick's initial implementation, as of version 6.5.4-3, any non-square image or one with an odd dimension will be padded automatically to be square at the maximum of the image width (N) or height (M) and to have even dimensions prior to applying the Forward Fourier Transform. The consequence of this is that after applying the Inverse Fourier Transform, such an image will need to be cropped back to its original dimensions.

As the Fourier Transform is composed of complex numbers, the result of the transform cannot be visualized directly. Therefore, the complex transform is separated into two component images in one of two forms.

More commonly, the two components that are created are the magnitude and phase representations of the complex numbers. The advantage of this is that both contain only non-negative values. The magnitude, by definition, is always non-negative. However, the phase normally ranges from -π to +π.

However, as images can not store such floating point values, it is usual to shift it while doing the forward transform to the range 0 to 2π and then shift it back when doing the inverse transform. This is also multiplied so as to span the span the fully dynamic range for images, from 0 to QuantumRange (as determined by the In-memory bit Quality used by this version of ImageMagick).

The magnitude has no fixed range of values. However, one important feature is that the value at the DC or zero frequency position will be the average color value for the whole image. Typically this also becomes the largest value for all the magnitude components. Usually all the other values in the magnitude image will be smaller than this. As a consequence of this the magnitude image generally will appear to be dark or even totally black to the naked eye.

By scaling the magnitude and applying a log transform of its intensity values usually will be needed to bring out any visual detail. The resulting 'log-transformed' magnitude image is known as the image's 'spectrum'. However remember that it is the 'magnitude' image, and not the 'spectrum' image, that should be used for the inverse transform.

The 'magnitude' and 'phase' component images are generated in ImageMagick using the "-fft" operator. While the "-ift" operator will take those two images and transform them back into the original image.

 

Using FFT

Now, lets simply try a Fourier Transform round trip on the Lena image. That is, we simply do the forward transform and immediately apply the inverse transform to get back the original image. Then we will compare the results to see the level of quality produced.


  time convert lena.png -fft -ift lena_roundtrip.png

  echo -n "RMSE = "
  compare -metric RMSE lena.png lena_roundtrip.png null:
  echo -n "PAE = "
  compare -metric PAE  lena.png lena_roundtrip.png null:
[IM Output] ==> [IM Output]
[IM Text]

The "compare" program above returns a measure of how different the two images are. In this case you can see that the general difference is very small, of about 0.22%. With a peak value difference in at least one pixel of about ("PAE", Peak Absolute Error) of just about 1%.

You can improve this by using a HDRI version of ImageMagick. (See FFT with HDRI below).

Lets take a closer look at the FFT images that were generated in the above round trip.

  convert lena.png -fft +adjoin lena_fft_%d.png
[IM Output]
Original
==> [IM Output]
Magnitude
[IM Output]
Phase

As John M. Brayer has said on his page about Fourier Transforms... We generally do not display PHASE images because most people who see them shortly thereafter succumb to hallucinogenics or end up in a Tibetan monastery.

Note that the "-fft" operator generated two images, the first image is the 'magnitude' component (yes it is mostly black with a single colored dot in the middle), while the second, almost random looking image, contains the 'phase' component.

PNG images can only store one image per file as such neither the "+adjoin" or the '%d' in the output filename was actually needed, as IM would handle this. However I include the options in the above for completeness, so as to make it clear I am generating two separate image files, not one. See Writing a Multi-Image Sequence for more details.

As two images are generated the magnitude image (first of zeroth image) is saved into "lena_fft_0.png" and phase image (second image) into "lena_fft_1.png".

You can also save these images into into a single file format allows multiple images, and removing the +adjoin" and '%d' components of the above. For example by saving it into the "MIFF", file format...

  convert lena.png -fft lena_fft.miff

Or you can save them into complete separate filenames using "-write" (See Writing Images)...

  convert lena.png -fft \
          \( -clone 0 -write lena_magnitude.png +delete \) \
          \( -clone 1 -write lena_phase.png +delete \) \
          null:

Note that in the above I used the special "NULL:" image format to junk the two images which are still preserved in memory for further processing.

And finally we again read in the two images again, so as to convert it back into a normal 'spatial' image...

  convert lena_magnitude.png lena_phase.png -ift lena_restored.png
[IM Output] [IM Output] ==> [IM Output]

Both images generated by the FFT process are very sensitive to modification, where even small changes can result greatly distorted results. As such it is important to never save them in any image format that could distort those values.

To prevent sever distortions when saving FFT images. It is best not to save them to disk at all, but hold them in memory while you process the image.

If you must save then it is best to use the Magick File Format "MIFF" so as to preserve the image at its highest quality (bit depth). This format can also save multiple images in the one file. For script work you can also use the verbose "TXT" Enumerated Pixel Format.

DO NOT save them using "JPEG" or "GIF" image formats.

If you must save these images into files for actual viewing, such as for a web browser, use the image format "PNG" (as we do in these examples). However it can only store one image per file.

The "TIFF" file format can also be used though is not as acceptable for web browsers. But it does allow multiple images per file.

It is also important to remember that both images are needed when recovering the image from the frequency domain. So it is no good saving one image an not the other if you plan on using them for image reconstruction.

Magnitude or Phase Only Images

Finally, lets try reconstructing an image from just its magnitude component or just its phase component.


  convert lena_fft_0.png  -size 128x128 xc:gray50  -ift lena_magitude_only.png

  convert -size 128x128 xc:gray1  lena_fft_1.png  -ift lena_phase_only.png
[IM Output]
Magnitude Only
[IM Output]
Phase Only

You will note from this that it is the phase image that actually contains most of the position information of the image, while the magnitude actually holds much of the color information. This is not exact, as there is some overlap in the information, but that is generally the case.

Note that the magnitude image generated for the 'Phase Only' image in the above is actually only 1% gray (almost pure black). Even so it still produces patches of very intense pixels, especially along edges. That is while position information is present, it has no real sense of what color things should actually be.

Just remember both images are needed to reconstruct the original image.

Frequency Spectrum Image

You will have noted that the magnitude image (the first or zeroth image), as appears almost totally black. It isn't really, but to our eyes all the values are very very small. Such an image isn't really very interesting to look at to study, so lets enhance the result with a log transform to produce the a 'frequency spectrum' image.

This is done by applying a strong Evaluate Log Transform to a Normalized 'magnitude' image.

  convert lena_fft_0.png -auto-level -evaluate log 10000 \
          lena_spectrum.png
[IM Output] ==> [IM Output]

Now we can see the detail in the spectrum version of the magnitude image.

You may even see some specific colors in the spectrum image, but generally such colors are unimportant in a spectrum image. It is the overall intensity of each frequency, and the patterns they produce that is far more important. As such you may also like to gray-scale the spectrum image after enhancement.

How much of a log enhancement you need to use depends on the image, so you should adjust it until you get the amount of detail you need to clearly see the pattern of the images frequency spectrum.

Alturnativally you can use the following small shell script, to calculate a log scaling factor to use for the specific magnitude image.

  scale=`convert lena_fft_0.png -auto-level \
          -format "%[fx:exp(log(mean)/log(0.5))]" info:`
  convert lena_fft_0.png -auto-level \
          -evaluate log $scale    lena_spectrum_auto.png
[IM Output]

However remember that you can not use a spectrum image, for the inverse "-ift" transform as it will produce overly bright image.

  convert lena_spectrum.png lena_fft_1.png -ift lena_roundtrip_fail.png

Basically as you have enhanced the 'magnitude' image, you have also enhanced the resulting image in the same way, producing the badly 'clipped' result shown.

[IM Output]

HDRI FFT Images

Talk about Accuracy here

However such images, while more exactly representing the frequency components of the FFT of the image, can contain negative and fractional values must be kept in memory, or saved to a limited number of file formats that is capable of handling floating-point values.

Floating point compatible file formats include "MIFF", "TIFF", "PFM" and the HDRI specific "EXR" file format.

For example here I use a HDRI version ImageMagick to generate another 'round trip' conversion of an image.

  # HDRI version of IM used
  time convert lena.png -fft -ift lena_roundtrip_hdri.png

  echo -n "RMSE = "
  compare -metric RMSE lena.png lena_roundtrip_hdri.png null:
  echo -n "PAE = "
  compare -metric PAE  lena.png lena_roundtrip_hdri.png null:
[IM Output]
[IM Text]

If you compare the results above with the previous non-HDRI comparison...
[IM Text]
You will see that the HDRI version of IM produced a more accurate result. Though with very slight speed penalty. It also required twice a much memory as a normal Q16 IM (See Compile-Time Quality).

In later examples processing an FFT of an image, will need such accuracy to produce good results. As such as we proceed with using Fast Fourier Transforms, a HDRI version ImageMagick will become a requirement.

FFT as Real-Imaginary Components

So far we have only represented 'components' that define the sine waves of an image (its "Fourier Transform") using a 'Magnitude' and a 'Phase' form.

However there is another way you can represent the waves that make up a Fourier Transform, using a mathematical idea known as "Complex Numbers".

For example if we take the 'Magnitude' and 'Phase' of a sine wave and plot them using 'Polar Coordinates' You can convert the two values into different mathematical concept of 'Real' and 'Imaginary' complex number components.
[Diagram]
Magnitude/Phase
[Diagram]
Complex Number
Real/Imaginary

This means you can also represent a FFT using a complex number, which also requires two components, and thus two images to represent one image.

However while the 'Magnitude' (r) and 'Phase' (θ) values are easily saved as positive values, 'Real' and 'Imaginary' values can be either positive or negative. As negative values may be involved, you have to use a specially compiled HDRI form of ImageMagick.

To generate a FFT of an image using Complex Numbers you would use the 'plus' form of the operators, "+fft" and "+ift". And can only save such images using special HDRI image file formats.

For example here I used a HDRI version of IM to also perform a 'round trip' FFT of an image, but this time generating Real/Imaginary images.

  # HDRI version of IM used
  time convert lena.png   +fft +ift   lena_roundtrip_ri.png

  echo -n "RMSE = "
  compare -metric RMSE lena.png lena_roundtrip_ri.png null:
  echo -n "PAE = "
  compare -metric PAE  lena.png lena_roundtrip_ri.png null:
[IM Output]
[IM Text]

As mentioned you must use a HDRI version when you use the plus forms to generate Real/Imaginary FFT images. If you don't the resulting image with have a 'dirty' look about them due to the 'clipping' of negative values. For example...

  # non-HDRI Q16 version of IM used
  convert lena.png   +fft +ift   lena_roundtrip_ri_bad.png
[IM Output]

The other thing to remember is that which form of FFT images you generate will also effect all image processing operations you want to apply to the FFT images. They are very different images, and as such they must be processed in different ways, with different mathematical operations.

Also as before you must have both Real and Imaginary component images to restore the final image. For example, here is what happens if we substitute a 'black' image for one of the components.

  # HDRI version of IM used
  convert lena.png +fft -delete 1 \
          -size 128x128 xc:black +ift lena_real_only.png
  convert lena.png +fft -delete 0 \
          -size 128x128 xc:black +ift lena_imaginary_only.png
[IM Output]
Real Only
[IM Output]
Imaginary Only

You can see from this that both Real/Imaginary FFT images contain vital information about the original image fairly equally. The biggest difference between the two is the special DC or 'average color' value, which we will look at next.


Properties Of The Fourier Transform

FFT of a Constant Image

Lets demonstrate some of these properties.

First lets simply take a constant color image and get its magnitude.

  convert -size 128x128 xc:gold constant.png
  convert constant.png -fft +delete constant_magnitude.png
[IM Output] ==> [IM Output]

Note that the magnitude image in this case is really pure black, except for a single colored pixel in the very center of the image, at the pixel location width/2, height/2. This pixel is the zero frequency or DC ('Direct Current') value of the image, and is the one pixel that does not represent a sine wave. In other words it is just the FFT constant component!

To see this single pixel more clearly lets also magnify that area of the image...

  convert constant_magnitude.png -gravity center -extent 5x5 \
           -scale 2000% constant_dc_zoom.gif
[IM Output]

Note that the color of the DC point is the same as the original image.

Actually it is a good idea to remember that what you are seeing is three values. That is the FFT images generated is actually three separate Fast Fourier transforms. A FFT for each of the three red, green and blue image channels. The FFT itself has no real knowledge about colors, only the color values or 'gray-levels'.

Effects of the DC Color

In a more typical non-constant image, the DC value is the average color of the image. The color you should generally get if you had completely blurred, averaged, or resized the image down to a single pixel or color.

For example lets extract the DC pixel from the FFT of the "Lena" image.

  convert lena.png -fft +delete lena_magnitude.png
  convert lena_magnitude.png -gravity center -extent 1x1 \
          -scale 60x60   lena_dc_zoom.gif
[IM Output] ==> [IM Output] ==> [IM Output]

As you can see the average color for the image is a sort of 'dark pink' color.

Another way of thinking about this special pixel is that it represents the center 'bias' level which all the other sine waves modify the image colors.

For example lets replace that 'dark pink' DC pixel with some other color such as the more orange color 'tomato'...

  convert lena.png -fft \
          \( -clone 0  -draw "fill tomato  color 64,64 point" \) \
          -swap 0 +delete -ift lena_dc_replace.png
[IM Output]

What is really happening is that by changing the DC value in the FFT images you are changing the whole image in that same way. Actually any change in the DC value (the difference) will be added (or subtracted) from each and every pixel in the resulting image. This is how changing the average color of the image gets replicated from the 'frequency domain' back into the 'spatial domain' as we did above.

Unfortunately, just as if we really were really adding some constant to each pixel, the final pixel colors in the reconstructed image could also be clipped by the maximum (white) or minimum (black) limits. As such this is not a recommended method of color tinting an image. But it is simpler than modifying every pixel in the whole image, though the FFT round trip will make it a slower technique).

While the 'phase' of the DC value is not important, it should always be a 'zero' angle (a colour value of 50% gray). If it is not set to 50% gray, the DC value will have a 'unreal' component, and its value modulated by the angle given.

Spectrum Of A Sine Wave Image

Next, lets take a look at the spectrum from a single sine wave image with 4 cycles.


  convert -size 128x129 gradient: -chop 0x1 -rotate 90 -evaluate sine 4 \
          sine4.png
  convert sine4.png -fft +delete \
          -auto-level -evaluate log 100  sine4_spectrum.png
[IM Output] ==> [IM Output]

The unusual creation of the sine wave in the above is necessary to ensure that the resulting wave image tiles correctly. If this is not done the resulting image does not tile perfectly as the left-most and right-most values of the sine wave are duplicate values. This results in undesired harmonics in the resulting FFT image.

An alternative is to generate a Sparse Color Gradient whcih is generated using 'image coordinates', instead of 'pixel coordinates'.

In the spectrum image (enhanced magnitude image) above, we can see that it has 3 dots. The center dot is as before the zero frequencies of average DC value. The other two dots represent the perfect sine wave that the FFT operator found in the image. As the frequency accross the width of the image is 4 cycles, the two frequency pixels are 4 pixels away from the center DC value.

But why two pixels?

Well that is because a single wave can be described in two completely different ways, (one with a negative direction and phase). Both descriptions are mathematically correct, and a fourier transform does not distinguish between them.

If we repeat this with a sine wave with 16 cycles, then again we see that it has 3 dots, but the dots are further apart. In this case the side dots are spaced 16 pixels to the left and right sides of the center dot.

  convert -size 128x129 gradient: -chop 0x1 -rotate 90 -evaluate sine 16 \
          -write sine16.png -fft -delete 1 \
          -auto-level -evaluate log 100 sine16_spectrum.png
[IM Output] ==> [IM Output]

From this you can see that perfect sine waves will be represented simply by two dots in the appropriate position. The distance this position is from the center DC value determine the frequency of the sine wave. the Smaller the wave length the highter the frequency, so the further the dots will be from the DC value.

In fact by dividing size of the image by the frequency (distance of the dots from the center), will give you the wavelength (distance between peaks) of the wave. In the above case: 128 pixels divided by 16 cycles, gives you a wavelength of 8 pixels between each 'band'.

Lets take a closer look at these three 'frequencies' by plotting their original magnitudes (not the logarithmic spectrum).

  convert sine16.png -fft -delete 1  miff:- |\
     im_profile - sine16_magnitude_pf.png
[IM Output]

Notice that the DC value has a value of 1/2 which is to be expected. But that the actual magnitude of the two 16 cycle sine-waves the fourier transform found is only 1/4.

The magnitude of the original sine-save is really 1/2 but the fourier transform divided tha magnitude into two, sharing the results across both plotted frequency waves, so each of the two components only has a magnitude of 1/4.

[IM Output] This duality of positive and negative frequencies in FFT images explains why all FFT image spectrums (such as the Lena spectrum repeated left) is always symetrical about the center. For every dot on one side of the image, you will always get a simular 'dot' mirrored across the center of the image.

The same thing happens with the 'phase' component of FFT image pair, but with a 180 degree shift in value as well.

This explains why the two images actually still only contains exactly the same amount of data as the original image. Each image contains a duplication of the information needed. So each image only contains half the information, but together they hold all the images information.

During generation the FFT algorithm only generates the left half the images. The other half is generated by rotations and merging of the generated image.

When converting Frequency Domain images back to a Spatial Domain Image, the algorithm again only looks at the left half of the image. The right half is completely ignored.

As such when (in later examples) you 'notch filter' a FFT magnitude image, you only really need to filter the left hand side of the magnitude image. You can save yourself some work by also ignoring the right half. However for clarity I will 'notch' both halves.

Generating FFT Images Directly

Now we can use the above information to actually generate a image of a sine wave. All you need to do is create a black and 50% gray image pair, and add 'dots' with the appropriate magnitude, and phase.

For example...

  convert -size 128x128  xc:black \
          -draw 'fill gray50  color 64,64 point' \
          -draw 'fill gray25  color 50,68 point' \
          -draw 'fill gray25  color 78,60 point' \
          generated_magnitude.png
  convert generated_magnitude.png \
          -auto-level -evaluate log 3  generated_spectrum.png
  convert -size 128x128  xc:gray50  generated_phase.png
  convert generated_magnitude.png generated_phase.png \
          -ift  generated_wave.png
[IM Output] [IM Output] ==> [IM Output]

And presto a perfect angled sine wave.

Of course you can only generate perfect sine waves at particular frequences, but that limitation can also work to our advantage. Basically any sine wave generated by a FFT image will be perfectly tilable. That is the left-right and top-bottom edges will exactly matchup. As all the frequencies a FFT image has this properity, making it a useful one.

Unfortunately all the frequencies will also be a power of two in any horizontal or vertical direction, and that is the limitation of this technqiue.

Actually only the first (left most) 'gray25' dot was needed to generate the sine wave as the IFT transform completely ignores the right half of the image as this should simply be a mirror of the left half.

The phase of the DC value must have a 'zero angle' (50% gray color). If you don't ensure that is the case the DC color value will be modulated by its non-zero phase, producing a darker, posibily 'clipped' image.

The phase color of the sine wave components (non-DC pixels) can be anything you like (completely random for example). Again only the phase of the left most dot actually matters. The right hand side is completely ignored.

FUTURE: Perlin Noise Generator using FFT

Spectrum of a Vertical Line

Show the FFT spectrum of a thin and thick line

Demonstrate how small features become 'big' and big features become 'small' in the FFT of the image. Link that back to the sine wave which could be regarded as a 'line' with a single harmonic. Rotate the line

Spectrum of a Rectangle Pattern Image

Next, lets look at the spectrum of white rectangle of width 8 and height 16 inside a black background.


  convert -size 8x16 xc:white -gravity center \
          -gravity center -background black -extent 128x128 rectangle.png
  convert rectangle.png -fft +delete \
          -auto-level -evaluate log 100 rect_spectrum.png
[IM Output] ==> [IM Output]

As you can see the resulting image has a very particular pattern, with a lot of harmonic frequencies. If we graph the center horizontal row and vertical column, we get...

  im_profile -s rect_spectrum.png rect_spectrum_pf.gif

  im_profile -s -v rect_spectrum.png rect_spectrum_pfv.gif
[IM Output] ==> [IM Output]

Now, lets rotate the rectangle by 45 degrees. We find that the spectrum is also rotated in the same direction by 45 degrees.


  convert rectangle.png -rotate 45 -gravity center -extent 128x128 \
          -write rect_rot45.png -fft -delete 1 \
          -auto-level -evaluate log 100 rect_rot45_spectrum.png
[IM Output] ==> [IM Output]

Spectrum Of A Flat Circular Pattern Image

Next, lets look at the spectrum from an image with a white, flat circular pattern, in one case with diameters of 12 (radius 6) and in another case with diameter of 24 (radius 12).


  convert -size 128x128 xc:black -fill white  \
          -draw "circle 64,64 64,70" -write circle6.png -fft -delete 1 \
          -auto-level -evaluate log 100 circle6_spectrum.png

  convert -size 128x128 xc:black -fill white  \
          -draw "circle 64,64 64,76" -write circle12.png -fft -delete 1 \
          -auto-level -evaluate log 100 circle12_spectrum.png
[IM Output] ==> [IM Output]
[IM Output] ==> [IM Output]

Note that the first image is very close to what we generated for the jinc example further above. It is however a little broken up. This artifacting occurs due to the small size of the circle. Since it is represented digitally, its perimeter is not perfectly circular.

The transform of the larger circle is better as its perimeter is a closer approximation of a true circle. We therefore conclude that indeed the transform of the flat circular shape is a jinc function and that the image containing the smaller diameter circle produces transform features that are more spread out and wider.

According to the properties listed above, the distance from the center to the middle of the first dark ring in the spectrum will be 1.22*N/d. When the diameter of the circle is d=12, we get a distance of 1.22*128/12=13. Likewise when the diameter of the circle is d=24, we get a distance of 1.22*128/24=6.5. These are the values that one measures in those images.

Spectrum Of A Gaussian Pattern Image

Next, lets look at the spectrum from two images, each with a white Gaussian circular pattern having sigmas of 8 and 16, respectively


  convert -size 128x128 xc:black -fill white \
          -draw "point 64,64" -gaussian-blur 0x8 -auto-level \
          -write gaus8.png -fft -delete 1 \
          -auto-level -evaluate log 1000 gaus8_spectrum.png

  im_profile -s gaus8.png gaus8_pf.gif
  im_profile -s gaus8_spectrum.png gaus8_spectrum_pf.gif
[IM Output] ==> [IM Output]
[IM Output] ==> [IM Output]


  convert -size 128x128 xc:black -fill white \
          -draw "point 64,64" -gaussian-blur 0x16 -auto-level \
          -write gaus16.png -fft -delete 1 \
          -auto-level -evaluate log 1000 gaus16_spectrum.png

  im_profile -s gaus16.png gaus16_pf.gif
  im_profile -s gaus16_spectrum.png gaus16_spectrum_pf.gif
[IM Output] ==> [IM Output]
[IM Output] ==> [IM Output]

This shows that the transform of the Gaussian circular shape is another Gaussian and that the larger the sigma in the original image, the smaller the sigma will be in the spectrum.

From the properties stated above, we know that the sigma in the spectrum will be just N/(2*sigma), where sigma is from the original image. So for an image of size N and sigma=8, the sigma in the spectrum will be 128/16=8. Similarly if the image's sigma is 16, then the sigma in the spectrum will be 128/32=4.

Spectrum Of A Grid Pattern Image

Next, lets transform an image containing just a set of grid lines spaced 16x8 pixels apart. The resulting spectrum is just an array of dots where the grid lines that are more closely spaced produce dots further apart and vice versa. According to the properties above, since the grid lines are spaced 16x8 pixels apart, then the dots should be spaced N/a=128/16=8 and M/b=128/8=16, which is just what is measured in this image.


  convert -size 16x8 xc:white -fill black \
          -draw "line 0,0 15,0" -draw "line 0,0 0,7" \
          -write mpr:tile +delete \
          -size 128x128 tile:mpr:tile \
          -write grid16x8.png -fft -delete 1 \
          -auto-level -evaluate log 100000 grid16x8_spectrum.png
[IM Output] ==> [IM Output]

 
Under Construction

A list like this is BAD..
Better to demonstrate each properity instead (as done above), which also allows it to be explained better rather than just presenting a raw list.

The following is a list of some of the important properties of the Fourier Transform.

  • High frequencies in the FFT (corresponding to rapidly varying intensities in the original image) lie near the outer parts of the spectrum.
  • Low frequencies in the FFT (corresponding to slowly varying intensities in the original image) lie near the center of the spectrum.
  • The zero frequency (DC) point in the FFT for an NxM original image lies at coordinates N/2,M/2.
  • The intensity value in the magnitude image at the DC point is equal the average graylevel in the original image. (This is a consequence of the scaling of the forward transform by 1/N).
  • Edges in an image give rise to transform components lying along lines perpendicular to the edges.
  • Smaller objects have more spread-out transforms; Larger objects have more compressed transform.
  • If one rotates the image, the transform rotates the same amount.
  • The transform of a uniform object lies along a line perpendicular to the dimension of the object.
  • The transform of a set of grid lines of spacing x=a,y=b in image size NxM is an array of dots: The spacing of the dots in the spectrum will be Dx=N/a and Dy=M/b.
  • The transform of a constant rectangle of dimension x=a,y=b in an image of size NxM is a sinc function: sinc(x*a/N)*sinc(y*b/M), where the sinc(x)=sin(π*x)/(π*x). Here is an example of what a sinc function looks like with a=8 and b=16 and profiles of its center row and column. The scaling factors of plus 0.21 and divide by 1.21 are because the sinc function ranges from -0.21 to 1 and thus has negative values that would otherwise be clipped in the image. Profiles of the log enhanced sinc are also displayed to show how it amplifies the smaller signals relative to the larger ones. (The 3D graphs shown below were not created with ImageMagick, but from an ImageJ plugin.) An important feature of the sinc function is that the spacing between its zeroes is a constant given by Dx=N/(a/2) and Dy=M/(b/2).

    
      N=128
      a=8
      b=16
      convert \( -size ${N}x1 xc: \
              -fx "ax=$a*pi*(i-w/2)/w; ax==0?1:(sin(ax)/ax+0.21)/1.21" -scale 128x128! \) \
              \( -size 1x${N} xc: \
              -fx "by=$b*pi*(j-h/2)/h; by==0?1:(sin(by)/by+0.21)/1.21" -scale 128x128! \) \
              -compose multiply -composite -auto-level -write sinc8x16.png \
              -evaluate log 10 sinc8x16_log10.png
    
      im_profile -s    sinc8x16.png sinc8x16_pf.gif
      im_profile -s -v sinc8x16.png sinc8x16_pfv.gif
      im_profile -s    sinc8x16_log10.png sinc8x16_log10_pf.gif
      im_profile -s -v sinc8x16_log10.png sinc8x16_log10_pfv.gif
    
    [IM Output] [IM Output] [IM Output] [Diagram]
    [IM Output] [IM Output] [IM Output] [Diagram]

    The absolute value of the sinc function is what corresponds to the magnitude of the transform. Since the sinc function has positive and negative lobes, when you take the absolute value, you double the number of oscillations. Therefore, in the spectrum, the spacing of importance is between the middle of successive dark troughs (which were the locations of the zeros before taking the absolute value). This spacing then becomes Dx=N/a and Dy=M/b, due to the doubling of the oscillations. Here is the same example, but with the log of the absolute value. Profiles of the center row and column are shown as well.

    
      N=128
      a=8
      b=16
      convert \( -size ${N}x1 xc: \
              -fx "ax=$a*pi*(i-w/2)/w; ax==0?1:abs(sin(ax)/ax)" -scale 128x128! \) \
              \( -size 1x${N} xc: \
              -fx "by=$b*pi*(j-h/2)/h; by==0?1:abs(sin(by)/by)" -scale 128x128! \) \
              -compose multiply -composite -auto-level \
              -evaluate log 100 sinc8x16abs_log100.png
    
      im_profile -s    sinc8x16abs_log100.png sinc8x16abs_log100_pf.gif
      im_profile -s -v sinc8x16abs_log100.png sinc8x16abs_log100_pfv.gif
    
    [IM Output] [IM Output] [IM Output] [Diagram]

  • The transform of a constant circle of diameter d in an image of size NxN is a jinc function: jinc(r*d/N), where jinc(r)=J1(πr)/(πr) and J1(r) is the Bessel function of the first kind of order one. A jinc function would be similar to a circularly symmetric sinc function, but its side lobes are much more suppressed than in the sinc function. Also in the jinc function, the spacing between successive zeroes is not constant as it was in the sinc function. Therefore we use the spacing from the center to the first zero, which is Dr=1.22*N/d. The factor of 1.22 is identified in Theory Of Remote Image Formation and is the first zero in the Bessel function. Here is an example of what a jinc function looks like with d=12 and also a profile of its center row. The scaling factors of plus 0.06613 and divide by 0.56613 are because the jinc function ranges from -0.06613 to 0.5 and thus has negative values that would otherwise be clipped in the image. Note that the jinc function is very difficult to compute and a series approximation to the Bessel function is used to evaluate it, which was obtained from The Handbook Of Mathematical Functions by Abramowitz and Stegun, formula 9.4.1 and 9.4.3, p369-370. We speed it up by computing it in 1D as a look up table (lut) image and then apply the lut image to a radial-gradient using the -clut option.

    
      N=128
      N2=`convert xc: -format "%[fx:sqrt(2)*$N]" info:`
      d=12
      rad=`convert xc: -format "%[fx:$N/2]" info:`
      rad2=`convert xc: -format "%[fx:sqrt(2)*$rad]" info:`
      fact=`convert xc: -format "%[fx:pi*$d/$N]" info:`
      a0=0.5; a1=-.56249985; a2=.21093573; a3=-.03954289;
      a4=.00443319; a5=-.00031781; a6=.00001109
      b0=.79788456; b1=.00000156; b2=.01659667; b3=.00017105
      b4=-.00249511; b5=.00113653; b6=-.00020033
      c0=-2.35619; c1=.12499612; c2=-.00005650; c3=-.00637879
      c4=.00074348; c5=.00079824; c6=-.00029166
      uu="(xx/3)"
      xxa="($a0+$a1*pow($uu,2)+$a2*pow($uu,4)+$a3*pow($uu,6)"
      xxa="$xxa+$a4*pow($uu,8)+$a5*pow($uu,10)+$a6*pow($uu,12))"
      iuu="(3/xx)"
      vv="($b0+$b1*$iuu+$b2*pow($iuu,2)+$b3*pow($iuu,3)"
      vv="$vv+$b4*pow($iuu,4)+$b5*pow($iuu,5)+$b6*pow($iuu,6))"
      ww="(xx+$c0+$c1*$iuu+$c2*pow($iuu,2)+$c3*pow($iuu,3)"
      ww="$ww+$c4*pow($iuu,4)+$c5*pow($iuu,5)+$c6*pow($iuu,6))"
      xxb="$vv*cos($ww)/(xx*sqrt(xx))"
    
      convert -size 1x${rad2} gradient: -rotate 90 \
              -fx "xx=$fact*i; (xx<=3)?($xxa+0.06613)/0.56613:($xxb+0.06613)/0.56613" \
              jinc12_lut.png
    
      convert \( -size ${N2}x${N2} radial-gradient: -negate \
                 -gravity center -crop ${N}x${N}+0+0 +repage \) \
              jinc12_lut.png -clut -write jinc12.png \
              -auto-level -evaluate log 10 jinc12_log10.png
    
      im_profile -s jinc12.png jinc12_pf.gif
      im_profile -s jinc12_log10.png jinc12_log10_pf.gif
    
    [IM Output] [IM Output] [Diagram]
    [IM Output] [IM Output] [Diagram]

    The absolute value of the jinc function is what corresponds to the magnitude of the transform. Since the jinc function has positive and negative lobes, when you take the absolute value, you double the number of oscillations. Therefore, in the spectrum, the spacing, Dr=1.22*N/d, corresponds to that from the center to the middle of the first dark trough (which used to be the location of the first zero). Here is the same example, but with the log of the absolute value. And we also graph the center row.

    
      N=128
      N2=`convert xc: -format "%[fx:sqrt(2)*$N]" info:`
      d=12
      rad=`convert xc: -format "%[fx:$N/2]" info:`
      rad2=`convert xc: -format "%[fx:sqrt(2)*$rad]" info:`
      fact=`convert xc: -format "%[fx:pi*$d/$N]" info:`
      a0=0.5;    a1=-.56249985;    a2=.21093573; a3=-.03954289;
      a4=.00443319;    a5=-.00031781;    a6=.00001109
      uu="(xx/3)"
      xxa="($a0+$a1*pow($uu,2)+$a2*pow($uu,4)+$a3*pow($uu,6)"
      xxa="$xxa+$a4*pow($uu,8)+$a5*pow($uu,10)+$a6*pow($uu,12))"
      b0=.79788456; b1=.00000156; b2=.01659667;
      b3=.00017105; b4=-.00249511; b5=.00113653; b6=-.00020033
      c0=-2.35619; c1=.12499612; c2=-.00005650;
      c3=-.00637879; c4=.00074348; c5=.00079824; c6=-.00029166
      iuu="(3/xx)"
      vv="($b0+$b1*$iuu+$b2*pow($iuu,2)+$b3*pow($iuu,3)"
      vv="$vv+$b4*pow($iuu,4)+$b5*pow($iuu,5)+$b6*pow($iuu,6))"
      ww="(xx+$c0+$c1*$iuu+$c2*pow($iuu,2)+$c3*pow($iuu,3)"
      ww="$ww+$c4*pow($iuu,4)+$c5*pow($iuu,5)+$c6*pow($iuu,6))"
      xxb="$vv*cos($ww)/(xx*sqrt(xx))"
    
      convert -size 1x${rad2} gradient: -rotate 90 \
              -fx "xx=$fact*i; (xx<=3)?abs($xxa):abs($xxb)" jinc12abs_lut.png
    
      convert \( -size ${N2}x${N2} radial-gradient: -negate \
                 -gravity center -crop ${N}x${N}+0+0 +repage \) \
              jinc12abs_lut.png -clut \
              -auto-level -evaluate log 100 \
              jinc12abs_log100.png
    
      im_profile -s jinc12abs_log100.png jinc12abs_log100_pf.gif
    
    [IM Output] [IM Output] [Diagram]

  • The transform of a Gaussian function of sigma=d in an image size NxN is a Gaussian function. The sigma of the Gaussian function in the spectrum will be sigma=N/(2d). See Fourier Transform Gaussian.
  • The phase looks uninteresting, but contains more information than the magnitude with regard to reconstructing the original image as something visually recognizable.
  • Convolution in the spatial domain is equivalent to multiplication in the frequency domain.
  • Other properties and characteristics are defined mathematically at either http://www.ph.tn.tudelft.nl/Courses/FIP/noframes/fip-Properti-2.html or http://en.wikipedia.org/wiki/Fourier_transform

 

Practical Applications

OK, now that we have covered the basics, what are the practical applications of using the Fourier Transform?

Some of the things that can be done include: 1) increasing or decreasing the contrast of an image, 2) blurring, 3) sharpening, 4) edge detection and 5) noise removal.

Changing The Contrast Of An Image - Coefficient Rooting

One can adjust the contrast in an image by performing the forward Fourier transform, raising the magnitude image to a power and then using that with the phase in the inverse Fourier transform. To increase, the contrast, one uses an exponent slightly less than one and to decrease the contrast, one uses an exponent slightly greater than one. So lets first increase the contrast on the Lena image using an exponent of 0.9 and then decrease the contrast using an exponent of 1.1.


  convert lena.png -fft \
          \( -clone 0 -evaluate pow 0.9 \) -delete 0 \
          +swap -ift lena_plus_contrast.png

  convert lena.png -fft \
          \( -clone 0 -evaluate pow 1.1 \) -delete 0 \
          +swap -ift lena_minus_contrast.png
[IM Output] ==> [IM Output]
[IM Output] ==> [IM Output]

However doing this to the original image would also have the same effect as doing this to the original image. That is a global modification of the magnitudes has the same effect as if you did a global modification of the original image.

Blurring An Image - Low Pass Filtering

One of the most important properties of Fourier Transforms is that convolution in the spatial domain is equivalent to simple multiplication in the frequency domain. In the spatial domain, one uses small, square-sized, simple convolution filters (kernels) to blur an image with the -convole option. This is called a low pass filter. The simplest filter is just a an equally-weighted, square array. That is all the values are ones, which are normalized by dividing by their sum before applying the convolution. This is equivalent to a local or neighborhood average. Another low pass filter is the Gaussian-weighted, circularly shaped filter provided by either -gaussian-blur or -blur.

In the frequency domain, one type of low pass blurring filter is just a constant intensity white circle surrounded by black. This would be similar to a circularly shaped averaging convolution filter in the spatial domain. However, since convolution in the spatial domain is equivalent to multiplication in the frequency domain, all we need do is perform a forward Fourier transform, then multiply the filter with the magnitude image and finally perform the inverse Fourier transform. We note that a small sized convolution filter will correspond to a large circle in the frequency domain. Multiplication is carried out via -composite with a -compose multiply setting.

So lets try doing this with two sizes of circular filters, one of diameter 40 (radius 20) and the other of diameter 28 (radius 14).


  convert -size 128x128 xc:black -fill white \
          -draw "circle 64,64 44,64" circle_r20.png
  convert lena.png -fft \
       \( -clone 0 circle_r20.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_blur_r20_spec.png +delete \) \
       -swap 0 +delete -ift lena_blur_r20.png

  convert -size 128x128 xc:black -fill white \
          -draw "circle 64,64 50,64" circle_r14.png
  convert lena.png -fft \
       \( -clone 0 circle_r14.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_blur_r14_spec.png +delete \) \
       -swap 0 +delete -ift lena_blur_r14.png
[IM Output] ==> [IM Output] x [IM Output] ==> [IM Output] ==> [IM Output]
[IM Output] ==> [IM Output] ==> [IM Output]

So we see that the image that used the smaller diameter filter produced more blurring. We also note the 'ringing' or 'ripple' effect near edges in the resulting images. This occurs because the Fourier Transform of a circle, as we saw earlier, is a jinc function, which has decreasing oscillations as it progresses outward from the center. Here however, the jinc function and the oscillations are in the spatial domain rather than in the frequency domain, as we had demonstrated earlier above.

So what can we do about this? The simplest thing is to taper the edges of the circles using various Windowing Functions. Alternately, one can use a filter such as a Gaussian shape that is already by definition tapered.

So lets do the latter and use two Gaussian blurred circles to remove most of the sever 'ringing' effects.

  convert circle_r20.png -blur 0x4 -auto-level gaussian_r20.png
  convert lena.png -fft \
       \( -clone 0 gaussian_r20.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_gblur_r20_spec.png +delete \) \
       -swap 0 +delete -ift lena_gblur_r20.png

  convert circle_r14.png -blur 0x4 -auto-level gaussian_r14.png
  convert lena.png -fft \
       \( -clone 0 gaussian_r14.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_gblur_r14_spec.png +delete \) \
       -swap 0 +delete -ift lena_gblur_r14.png
[IM Output] ==> [IM Output] x [IM Output] ==> [IM Output] ==> [IM Output]
[IM Output] ==> [IM Output] ==> [IM Output]

This of course is much better.

The ideal low-pass filter is not to blur circles at all, but actually use a proper gaussian curve of sigma rather than a radius.

Of course in this example we ended up doing a blur, to do a blur! However the blur pattern that is multiplied against the FFT magnitude image used is fixed, and could in fact be retrieved from a pre-generated cache. Also the multiplying image does not need to be the full size of the original image, you can use a smaller image. As such the above can be a lot faster for large images, and in the case of handling lots of images.

The more important point is for large strong blurs, the frequency domain image is small, and only does a single multiply, rather then having to average lots of pixels, for each and every pixel in the original image.

For small sized blurs you may be better with the more direct convolution blur.

Detecting Edges In An Image - High Pass Filtering

In the spatial domain, high pass filters that extract edges from an image are often implemented as convolutions with positive and negative weights such that they sum to zero.

Things are much simpler in the frequency domain. Here a high pass filter is just the negated version of the low pass filter. That is where the low pass filter is bright, the high pass filter is dark and vice versa. So in ImageMagick, all we need do is to -negate the low pass filter image.

So lets apply high pass filters to the Lena image using a circle image. And then again using a purely gaussian curve.


  convert circle_r14.png -negate circle_r14i.png
  convert lena.png -fft \
       \( -clone 0 circle_r14i.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_edge_r14_spec.png +delete \) \
       -delete 0 +swap -ift -normalize lena_edge_r14.png

  convert -size 128x128 xc: -draw "point 64,64" -blur 0x14 \
          -auto-level   gaussian_s14i.png
  convert lena.png -fft \
       \( -clone 0 gaussian_s14i.png -compose multiply -composite \) \
       \( +clone -evaluate log 10000 -write lena_edge_s14_spec.png +delete \) \
       -delete 0 +swap -ift -normalize lena_edge_s14.png
[IM Output] ==> [IM Output] x [IM Output] ==> [IM Output] ==> [IM Output]
[IM Output] ==> [IM Output] ==> [IM Output]

Carefully examining these two results, we see that the simple circle is not quite as good as the gaussian, as it has 'ringing' artifacts and is not quite as sharp.

Sharpening An Image - High Boost Filtering

The simplest way to sharpen an image is to high pass filter it (without the normalization stretch) and then blend it with the original image.

  convert lena.png -fft \
       \( -size 128x128 xc: -draw "point 64,64" -blur 0x14 -auto-level \
          -clone 0 -compose multiply -composite \) \
       -delete 0 +swap -ift \
       lena.png -compose blend -set option:compose:args 100x100 -composite \
       lena_sharp14.png
[IM Output] ==> [IM Output]

Here a high pass filter, is done in the frequency domain and the result transformed back to the spatial domain where it is blended with the original image, to enhance the edges of the image.

Noise Removal - Notch Filtering

Many noisy images contain some kind of patterned noise. This kind of noise is easy to remove in the frequency domain as the patterns show up as either a pattern of a few dots or lines. Recall a simple sine wave is a repeated pattern and shows up as only 3 dots in the spectrum.

In order to remove this noise, one simply, but unfortunately, has to manually mask (or notch) out the dots or lines in the magnitude image. We do this by transforming to the frequency domain, create a grayscale version of the spectrum, mask the dots or lines, threshold it, multiply the binary mask image with the magnitude image and then transform back to the spatial domain.

Lets try this on the clown image, which contains a diagonally striped dither-like pattern. First we transform the clown image to create its magnitude and phase images.


  convert clown.jpg -fft \
          \( +clone  -write clown_phase.png +delete \) +delete \
          -write clown_magnitude.png  -colorspace gray \
          -auto-level -evaluate log 100000  clown_spectrum.png
[IM Output]
Original
==> [IM Output]
Spectrum
[IM Output]
Phase

We see that the spectrum contains four bright star-like dots, one in each quadrant. These unusal points represent the pattern in the image we want to get rid of.

The bright dot and lines in the middle of the image are of no concern as they represent the DC (average image color), and effects of the edges from the image, and should not be modified.

Note that when generating the spectrum image I forced the resulting image to be a pure gray-scale image. This is so I can now loaded the image into an editor, and using any non-gray color (such as red), I masked out the area of those 4 star like patterns.

When finished editing I can extract the areas I colored, like this...

  convert clown_spectrum_edited.png clown_spectrum.png \
          -compose difference -composite \
          -threshold 0 -negate clown_spectrum_mask.png
[IM Output] ==> [IM Output]

Now we simply multiply the mask with the magnitude and use the result with the original phase image to transform back to the spatial domain. We display the original image next to it for comparison


  convert clown_magnitude.png clown_spectrum_mask.png \
          -compose multiply -composite \
          clown_phase.png -ift clown_filtered.png
[IM Output] ==> [IM Output]

A very good result. But we can do even better.

As you saw in the previous examples, simple 'circles' are not particularly friendly to a FFT image, so lets blur the mask slightly...

  convert clown_spectrum_mask.png \
          -blur 0x5 -level 50x100%  clown_mask_blurred.png
[IM Output]

And filter the clown, this time re-generating the FFT images in memory.

  convert clown.jpg -fft \
          \( -clone 0 clown_mask_blurred.png -compose multiply -composite \) \
          -swap 0 +delete -ift clown_filtered_2.png
[IM Output]

A simply amazing result! And one that could posibly be improved further by adjusting that mask to fit the 'star' shapes better.

We can even take the difference between the original and the result to create an image of the areas where noise was removed.

  convert clown.jpg clown_filtered_2.png -compose difference \
          -composite -normalize clown_noise.png
[IM Output]

Lets try this on an another example. This time on a "Twigs" image found on the RoboRealm website, which contains an irregular pattern of horizontal and vertical stripes.

Again we extract a grey-scale spectrum image, just as we did before.

  convert twigs.jpg -fft +delete -colorspace gray \
          -auto-level -evaluate log 100000 twigs_spectrum.png
[IM Output] ==> [IM Output]

In this case, as the noise in the image is horizontally and vertically oriented, this shows up as thick horizontal and vertical bands along the center lines but not in the actual center of the image.

Again we mask out the parts using a image editor, this time using a 'blue' color (it really doesn't matter which color is used)...

  convert twigs_spectrum_edited.png twigs_spectrum.png \
          -compose difference -composite \
          -threshold 0 -negate twigs_spectrum_mask.png
[IM Output] ==> [IM Output]

Now we again multiply the mask with the FFT magnitude image, and reconstruct the image.

  convert twigs.jpg -fft \
          \( -clone 0 twigs_spectrum_mask.png -compose multiply -composite \) \
          -swap 0 +delete  -ift twigs_filtered.png
[IM Output] ==> [IM Output]

And we can take the difference between the original and the result to create an image of the areas where noise was removed.


  convert twigs.jpg twigs_filtered.png -compose difference -composite \
          -normalize twigs_noise.png
[IM Output]

Advanced Applications

Some of the other more advanced applications of using the Fourier Transform include: 1) deconvolution (deblurring) of motion blurred and defocused images and 2) normalized cross correlation to find where a small image best matches within a larger image.

Coming Soon!

Examples of FFT Multiplication moved to a sub-directory

Created: 22 July 2000
Updated: 13 August 2009
Author: Fred Weinhaus, <fmw at alink dot net> with editing and formating by Anthony Thyssen, <A.Thyssen@griffith.edu.au>
Examples Generated with: [version image]
URL: http://www.imagemagick.org/Usage/fourier/