The Great FFT
Every time you make a phone call, compress an image, or use noise-canceling headphones, you're benefiting from one algorithm: the Fast Fourier Transform. If you're in software, you've probably wondered: what are the coolest algorithms ever discovered? As a fun exploration, I decided to work through SIAM's top 10 algorithms of the 20th century.
The Fast Fourier Transform (FFT) algorithm is revolutionary. Its applications touch nearly every area of engineering. The Cooley-Tukey paper rediscovered the algorithm (it originally appeared in Gauss's notes on astronomical calculations!) and popularized it. It remains one of the most widely cited papers in science and engineering.
FFT is something I've used extensively but never fully understood. I always thought of it as something that makes the Fourier transform faster for viewing time-domain signals in the frequency domain. While that's one application, the key insight is much broader: FFT is fundamentally about making polynomial operations blazingly fast.
Polynomials
A polynomial is an expression of the form:
\[ A(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \ldots + a_{n-1}x^{n-1} \]
where \(a_i\) are coefficients (typically real numbers) and \(x\) is a variable. This polynomial \(A(x)\) has degree \(n-1\) (one less than the number of terms).
Representing Polynomials
How can we represent polynomials in a computer? There are two fundamental approaches:
Coefficient Representation
\[ (a_0, a_1, a_2, \ldots, a_{n-1}) \]
Simple—it's just a vector of numbers! This representation handles any one-dimensional data naturally. We can write a function \(A(x)\) to evaluate the polynomial, or simply work with the coefficients as a vector.
Point-Value Representation
\[(x_0, y_0), (x_1, y_1), (x_2, y_2), \ldots, (x_{n-1}, y_{n-1})\]
Since a polynomial \(A(x)\) is a function mapping \(x\) to \(y\), we can represent it using input-output pairs. But do we need to define \(A(x)\) at all possible values? No! There's a fundamental property:
Given \(n\) pairs \((x_0,y_0), (x_1,y_1), \ldots, (x_{n-1},y_{n-1})\) where all \(x_i\)'s are distinct, there exists a unique polynomial \(p(x)\) of degree at most \(n-1\) such that \(p(x_i) = y_i\) for all \(i\).
Intuitively, this makes sense: a line requires 2 points, a parabola needs 3, and so on. A polynomial of degree \(n-1\) is uniquely determined by \(n\) point-value pairs.
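As a quick numerical check (a sketch using NumPy; the sample polynomial is just an example), three points pin down a parabola exactly:

```python
import numpy as np

# Sample p(x) = 3x^2 + 2x + 1 at three distinct points.
xs = np.array([0.0, 1.0, 2.0])
ys = 3 * xs**2 + 2 * xs + 1

# polyfit returns coefficients from the highest degree down.
coeffs = np.polyfit(xs, ys, deg=2)
print(coeffs)  # [3. 2. 1.] -- the unique degree-2 fit recovers p exactly
```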
Operations on Polynomials
What can we do with polynomials?
Evaluation
Given polynomial \(p\) and number \(x\), compute \(p(x)\).
Addition
Given polynomials \(p(x)\) and \(q(x)\), find polynomial \(r = p + q\) such that \(r(x) = p(x) + q(x)\) for all \(x\). If both have degree \(n-1\), the sum also has degree at most \(n-1\).
Multiplication
Given polynomials \(p(x)\) and \(q(x)\), find polynomial \(r = p \cdot q\) such that \(r(x) = p(x) \cdot q(x)\) for all \(x\). If both have degree \(n-1\), the product has degree \(2n-2\).
Complexity Analysis
Let's analyze the time complexity for each representation:
Coefficient Representation
Evaluation: A simple loop over the terms takes \(O(n)\) operations; Horner's method achieves the same bound with fewer multiplications.
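A minimal sketch of Horner's method (the helper name `evaluate` is mine):

```python
def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_{n-1}*x^(n-1) via Horner's method:
    rewrite as a_0 + x*(a_1 + x*(a_2 + ...)) to use only n-1 multiplications."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# p(x) = 1 + 2x + 3x^2 at x = 2: 1 + 4 + 12 = 17
print(evaluate([1, 2, 3], 2))  # 17
```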
Addition: Element-wise addition in \(O(n)\) operations.
Multiplication: This is the bottleneck: the standard schoolbook algorithm requires \(O(n^2)\) operations. Asymptotically faster techniques exist, but they come with large constant factors.
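For concreteness, here's the standard \(O(n^2)\) multiplication as a short sketch:

```python
def multiply_naive(p, q):
    """Schoolbook polynomial multiplication in coefficient form: O(n^2)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b  # the x^i and x^j terms contribute to x^(i+j)
    return r

# (1 + x)(1 + x) = 1 + 2x + x^2
print(multiply_naive([1, 1], [1, 1]))  # [1, 2, 1]
```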
Point-Value Representation
Addition: Given polynomials \((x_i, y_i)\) and \((x_i, z_i)\), simply compute \((x_i, y_i + z_i)\). This takes \(O(n)\) operations.
Note: Both polynomials must be evaluated at the same points, otherwise we need Lagrange interpolation.
Multiplication: Given polynomials \((x_i, y_i)\) and \((x_i, z_i)\), compute \((x_i, y_i \cdot z_i)\). This takes \(O(n)\) operations.
Note: Multiplying two degree-\((n-1)\) polynomials yields a degree-\((2n-2)\) polynomial, so we need more sample points. We handle this by evaluating both input polynomials at \(2n-1\) points before multiplication.
Evaluation: Evaluating at a new point requires first interpolating back to coefficient form, which takes \(O(n^2)\) operations.
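In code, both point-value operations are one-liners (a sketch assuming both polynomials were evaluated at the same, sufficiently many, points):

```python
def add_pv(ys, zs):
    """Add polynomials in point-value form (shared sample points x_i)."""
    return [y + z for y, z in zip(ys, zs)]

def mul_pv(ys, zs):
    """Multiply polynomials in point-value form (shared sample points x_i,
    enough of them to determine the higher-degree product)."""
    return [y * z for y, z in zip(ys, zs)]

# p(x) = q(x) = 1 + x sampled at x = 0, 1, 2 gives values [1, 2, 3].
# Three points suffice here since the product has degree 2 (2n - 1 = 3).
print(mul_pv([1, 2, 3], [1, 2, 3]))  # [1, 4, 9] -- matches (1 + x)^2
```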
Summary
| Representation | Multiplication | Evaluation | Addition |
|---|---|---|---|
| Coefficient | \(O(n^2)\) | \(O(n)\) | \(O(n)\) |
| Point-Value | \(O(n)\) | \(O(n^2)\) | \(O(n)\) |
The question becomes: Can we convert between representations efficiently to get the best of both worlds? This is exactly where FFT shines.
Converting Between Representations
For polynomial \(A(x) = a_0 + a_1x + \ldots + a_{n-1}x^{n-1}\), converting from coefficient to point-value representation at \(n\) distinct points \((x_0, x_1, \ldots, x_{n-1})\) can be expressed as:
\[ \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^{n-1} \\ 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^{n-1} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix} \]
The matrix of powers is the Vandermonde matrix \(V\). The conversion is \(\vec{y} = V\vec{a}\) and requires \(O(n^2)\) operations.
To convert back from point-value to coefficient representation, we compute \(\vec{a} = V^{-1}\vec{y}\), which also takes \(O(n^2)\) operations.
Fact: \(V\) is invertible when all \(x_i\)'s are distinct.
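Here's a small sketch of both conversions with NumPy (solving the linear system rather than forming \(V^{-1}\) explicitly):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # p(x) = 1 + 2x + 3x^2
xs = np.array([0.0, 1.0, 2.0])  # any distinct sample points

# Vandermonde matrix with columns 1, x, x^2, ...
V = np.vander(xs, N=len(a), increasing=True)

y = V @ a                       # coefficient -> point-value
a_back = np.linalg.solve(V, y)  # point-value -> coefficient
print(y)       # [ 1.  6. 17.]
print(a_back)  # [1. 2. 3.]
```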
The Key Insight
Here's the crucial observation: we get to choose the sample points \(x_i\) in matrix \(V\)! If we pick these values with the right structure, maybe this conversion can be much faster. We're trading generality for speed, but we don't care—we still get efficient conversion between representations.
FFT exploits this freedom of choice.
Divide and Conquer Strategy
Our Goal
Let's be crystal clear about what we want:
- We have polynomial \(A(x)\) in coefficient form \(\langle a_0, a_1, \ldots, a_{n-1} \rangle\)
- We have an input set \(X\) of \(n\) points
- We need to compute \(\langle y_0, y_1, \ldots, y_{n-1} \rangle\) where \(y_i = A(x_i)\) for all \(x_i \in X\)
- The key insight: we're free to choose the input set \(X\)
The essence of divide and conquer is:
- Divide: Split problem into smaller subproblems
- Conquer: Recursively solve subproblems
- Combine: Merge solutions into solution for original problem
Divide and Conquer for Polynomials
Consider polynomial \(A(x) = a_0 + a_1x + a_2x^2 + \ldots + a_7x^7\).
Divide: Split the coefficients of \(A(x)\) into even-indexed and odd-indexed halves: \[ A_{\text{even}}(x) = a_0 + a_2x + a_4x^2 + a_6x^3 \] \[ A_{\text{odd}}(x) = a_1 + a_3x + a_5x^2 + a_7x^3 \]
Conquer: Recursively compute \(A_{\text{even}}(z)\) and \(A_{\text{odd}}(z)\) for all \(z \in \{x^2 | x \in X\}\)
Combine: Use the identity: \[ A(x) = A_{\text{even}}(x^2) + x \cdot A_{\text{odd}}(x^2) \]
The combine step costs only one multiplication and one addition per evaluation point: \(O(1)\) work each!
Base case: Stop when polynomial has degree 0 (one coefficient).
Note: We evaluate both \(A_{\text{even}}\) and \(A_{\text{odd}}\) at \(x^2\). This is crucial for the math to work out.
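A quick sanity check of the identity, reusing the Horner-style `evaluate` from earlier:

```python
def evaluate(coeffs, x):
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

coeffs = [5, 3, 8, 2, 4, 7, 1, 6]  # a_0 through a_7
even = coeffs[0::2]                # a_0, a_2, a_4, a_6
odd = coeffs[1::2]                 # a_1, a_3, a_5, a_7

x = 2
lhs = evaluate(coeffs, x)
rhs = evaluate(even, x * x) + x * evaluate(odd, x * x)
print(lhs, rhs)  # 1179 1179 -- A(x) = A_even(x^2) + x * A_odd(x^2)
```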
Making Subproblems Smaller
For divide and conquer to be effective, subproblems must be smaller than the original. We need the set \(Z = \{x^2 | x \in X\}\) to have fewer elements than \(X\).
Can we find a set of \(n\) points such that squaring each element produces a smaller set?
For \(n = 2\): If \(X = \{a, -a\}\), then \(Z = \{a^2\}\) has size 1. This works!
For \(n = 4\): We need 4 points that become 2 points when squared. This is where complex numbers become essential.
Enter Complex Numbers
Think of complex numbers as rotations in the plane. The key insight from Kalid's excellent explanation is that complex numbers represent rotations around the origin.
Let's build our collapsing sets:
- \(S_1 = \{1\}\) (size 1)
- \(S_2 = \{1, -1\}\) (size 2, squares to \(S_1\))
- \(S_4 = \{1, -1, i, -i\}\) (size 4, squares to \(S_2\))
- \(S_8 = \{1, -1, i, -i, \frac{1+i}{\sqrt{2}}, \frac{-1-i}{\sqrt{2}}, \frac{i-1}{\sqrt{2}}, \frac{-i+1}{\sqrt{2}}\}\) (size 8, squares to \(S_4\))
We can continue this pattern indefinitely! For any power of 2, we can find a set that "collapses" by half when squared.
These special sets are called the \(n\)-th roots of unity: they're the \(n\) complex numbers that, when raised to the \(n\)-th power, equal 1.
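We can generate these collapsing sets directly (a NumPy sketch; the rounding merely merges floating-point duplicates):

```python
import numpy as np

def roots_of_unity(n):
    """The n-th roots of unity: e^(2*pi*i*k/n) for k = 0, ..., n-1."""
    return np.exp(2j * np.pi * np.arange(n) / n)

s8 = roots_of_unity(8)
s4 = np.unique(np.round(s8**2, decimals=12))  # squaring collapses 8 -> 4
print(len(s8), len(s4))  # 8 4
```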
Runtime Analysis
Let's analyze our divide-and-conquer algorithm:
- Take polynomial \(A(x)\) to be evaluated on set \(X\) with \(|X| = n\)
- Reduce to two subproblems: \(A_{\text{even}}\) and \(A_{\text{odd}}\), both evaluated on set \(Y = \{x^2 | x \in X\}\) with \(|Y| = n/2\)
- Combine solutions in \(O(n)\) time
The recurrence relation is: \[T(n) = 2T(n/2) + O(n)\]
Solving this recurrence (the same one that governs merge sort), we get \(T(n) = O(n \log n)\).
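Putting the pieces together, here's a minimal recursive sketch (it uses the \(e^{-2\pi i k/n}\) sign convention so the output can be checked against `np.fft.fft`; this is a textbook radix-2 recursion, not an optimized implementation):

```python
import numpy as np

def fft(coeffs):
    """Evaluate the polynomial with these coefficients at the n-th roots
    of unity, where n = len(coeffs) is assumed to be a power of 2."""
    n = len(coeffs)
    if n == 1:
        return list(coeffs)  # a constant polynomial: same value everywhere
    even = fft(coeffs[0::2])  # A_even at the (n/2)-th roots of unity
    odd = fft(coeffs[1::2])   # A_odd at the (n/2)-th roots of unity
    ys = [0j] * n
    for k in range(n // 2):
        w = np.exp(-2j * np.pi * k / n)        # x = the k-th n-th root
        ys[k] = even[k] + w * odd[k]           # A(x) = A_even(x^2) + x*A_odd(x^2)
        ys[k + n // 2] = even[k] - w * odd[k]  # the (k + n/2)-th root is -x
    return ys

print(np.allclose(fft([5, 3, 8, 2]), np.fft.fft([5, 3, 8, 2])))  # True
```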
Remarkable! We can now multiply polynomials in coefficient form in \(O(n \log n)\) time by:
- Converting to point-value representation using FFT (\(O(n \log n)\))
- Performing the operation (\(O(n)\))
- Converting back to coefficient representation using inverse FFT (\(O(n \log n)\))
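Here's that whole pipeline as a sketch on top of NumPy's FFT (zero-padding to the next power of 2, as the recursion expects; the final rounding assumes integer coefficients):

```python
import numpy as np

def poly_multiply(p, q):
    """Multiply two coefficient-form polynomials in O(n log n) time."""
    n = len(p) + len(q) - 1           # number of coefficients in the product
    size = 1 << (n - 1).bit_length()  # next power of 2 >= n
    py = np.fft.fft(p, size)          # coefficient -> point-value (FFT)
    qy = np.fft.fft(q, size)
    ry = py * qy                      # O(n) pointwise multiplication
    r = np.fft.ifft(ry).real          # point-value -> coefficient (inverse FFT)
    return np.rint(r[:n]).astype(int) # round away floating-point error

# (1 + x)(1 + x) = 1 + 2x + x^2 -- matches the naive method from earlier
print(poly_multiply([1, 1], [1, 1]))  # [1 2 1]
```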
This elegant divide-and-conquer algorithm is the essence of the Fast Fourier Transform!
Looking Forward
This explanation focused on the core algorithmic insight behind FFT. In future posts, we'll explore:
- Implementation details and the role of \(e^{i\theta}\)
- How this connects to traditional Fourier analysis
- Practical applications in signal processing, image compression, and beyond
The beauty of FFT lies not just in its efficiency, but in how it reveals deep connections between seemingly different mathematical concepts: polynomials, complex numbers, and harmonic analysis.