The FFT on a GPU

Kenneth Moreland1 and Edward Angel2

1 Sandia National Laboratories, Albuquerque, NM, USA

2 Department of Computer Science, University of New Mexico, Albuquerque, NM, USA

 

M. Doggett, W. Heidrich, W. Mark, A. Schilling (Editors)

Graphics Hardware (2003)

 

Источник: http://www.cs.unm.edu/~kmorel/documents/fftgpu/fftgpu.pdf

 

 

Abstract

 

The Fourier transform is a well known and widely used tool in many scientific and engineering fields. The Fourier transform is essential for many image processing techniques, including _ltering, manipulation, correction, and compression. As such, the computer graphics community could benefit greatly from such a tool if it were part of the graphics pipeline. As of late, computer graphics hardware has become amazingly cheap, powerful, and exible. This paper describes how to utilize the current generation of cards to perform the fast Fourier transform (FFT) directly on the cards. We demonstrate a system that can synthesize an image by conventional means, perform the FFT, filter the image, and finally apply the inverse FFT in well under 1 second for a 512 by 512 image. This work paves the way for performing complicated, real-time image processing as part of the rendering pipeline. Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Bitmap and frame buffer operations I.4.3 [Image Processing and Computer Vision]: Filtering

 

1. Introduction

 

Scientists and engineers have been using digital signal processing to manipulate images since the mid 1960's1. Over the past 40 years a wide variety of techniques have been developed to help enhance images for better human visual perception and autonomous machine perception. However, very little of this technology has been applied to real time computer synthesized graphics. The high computational intensity of such operations has been prohibitive. The graphics hardware that generated these images was, until very recently, not exible enough to perform any but the simplest image filltering. Also, moving the image data to and from another processing unit, such as the CPU, is in itself a serious bottleneck. Fortunately, the latest releases of graphics hardware now have the power and exibility to perform even the most complicated image filtering.

Digital image processing algorithms should be a good fit for modern GPU hardware. Any digital image processing technique entails a repetitive operation on the pixels of an image. Graphics processors are designed to perform a block of operations on groups of vertices or pixels, and they do this very efficiently. Furthermore, the performance growth of graphics hardware

has been exceeding Moore's law, with an annual increase in performance rate of over 2.4. In addition, the functionality of these cards has also increased dramatically. Nearly every component on the GPU can now be reprogrammed. A GPU is no longer a fixed pipeline but is now better described as a SIMD parallel processor2 or a streaming processor3. Furthermore, these processors can now perform computations on full 32 bit floating point values as opposed to 8 bit fixed values of one generation ago.

Fourier domain processing is not currently done for real-time graphics synthesis because performing transforms on a CPU requires data to be moved to and from the graphics card, a serious bottleneck. Furthermore, commodity graphics hardware was not powerful enough to perform the FFT necessary for complicated image processing. However, the current generation of graphics cards have the power, programmability, and floating point precision required to perform the FFT efficiently.

This paper describes how we used a commodity graphics card to perform the FFT and filter images. We start by providing an overview of the FFT algorithm to facilitate understanding of our implementation. We will then discuss some properties of real Fourier transforms that allow us to more efficiently implement the FFT on a graphics card. Finally, we will discuss the details of our implementation.