The naive implementation of an $N$-point DFT is basically a multiplication by a $N \times N$ matrix. This results in a complexity of $\mathcal{O}(N^2)$.

One of the most common Fast Fourier Transform (FFT) algorithm is the **radix-2** algorithm. This is a basic divide and conquer approach. 

First define the *"twiddle factor"* as:
$$W_N \triangleq e^{-j\frac{2\pi}{N}}$$
where $j \triangleq \sqrt{-1}$ is the imaginary unit, then the DFT $X[k]$ of $x[n]$ is given by
$$X[k] = \sum_{n=0}^{N-1} x[n] \, W_N^{kn} \, .$$
The sum can then be divided in two sums as follows
$$X[k] = \sum_{n=0}^{N-1} x[2n]W_N^{2kn} + \sum_{n=0}^{N-1} x[2n+1]W_N^{k(2n+1)}$$
where the first summation deals with the even samples of $x[n]$ and the second with the odd samples of $x[n]$. Defining $x_e[n] \triangleq x[2n]$ and $x_o[n] \triangleq x[2n+1]$ and using the fact that

 1. $W_N^{k(2n+1)} = W_N^{2kn}W_N^k$, and
 2. $W_N^{2kn} = W_{N/2}^{kn}$,

this can be re-written as
$$
\begin{align}
X[k] &= \sum_{n=0}^{N/2-1} x_e[n] W_{N/2}^{kn} + W_N^k\sum_{n=0}^{N/2-1} x_o[n]W_{N/2}^{kn} \\
& = X_e[k] + W_N^k X_o[k]
\end{align}
$$
where $X_e[k]$ and $X_o[k]$ are the $\tfrac{N}{2}$-point DFT transforms of the even and odd samples of $x[n]$ respectively. So we just transformed a single $N$-point DFT into two smaller $\tfrac{N}{2}$-point DFTs. This reduces computational cost because
$$ \left( \frac{N}{2} \right)^2 + \frac{N}{2} < N^2 $$
as $N$ gets large.

We can then re-iterate the same process on those two smaller DFTs. This divide-and-conquer approach allows to reach complexity of $\mathcal{O}(N\log N)$, which is **way better** than the $\mathcal{O}(N^2)$ we had with the naive DFT implementation (as is greatly illustrated by *leftaroundabout*'s answer).