7
$\begingroup$

I have to solve the following problem:

Al and Bob are arguing about their algorithms. Al claims his $O(n\log n)$ time method is always faster than Bob’s $O(n^2)$ time method. To settle the issue, they perform a set of experiments. To Al’s dismay, they find that if $n<100$, the $O(n^2)$ time algorithm runs faster, and only when $n \geq 100$ is the $O(n\log n)$ time one better. Explain how this is possible.

Looking at the big-O definition, I can see that $f(x)$ is $O(g(x))$ if there are constants $C$ and $k$ such that $|f(x)|\leq C|g(x)|$ whenever $x>k$. I also know that $O(n^2)$ comes from a polynomial of degree 2 that might have a leading coefficient and lower-degree terms.

I cannot see how to relate these facts in order to reach a conclusion, although I suspect that the constants $C$ and $k$ could play an important role in it.

I would very much appreciate any clue so I can get a better idea of how to formulate an answer.

$\endgroup$
5
  • 1
    $\begingroup$ Some of our reference questions (1, 2) may be related. $\endgroup$ Commented Jul 14, 2015 at 5:33
  • 3
    $\begingroup$ Is "linearithmic" a word? $\endgroup$ Commented Jul 14, 2015 at 14:05
  • 2
    $\begingroup$ @AndrejBauer: Yes, it is. en.wikipedia.org/wiki/Time_complexity#Linearithmic_time $\endgroup$ Commented Jul 14, 2015 at 15:22
  • 4
    $\begingroup$ I counter with xkcd.com/978 $\endgroup$ Commented Jul 14, 2015 at 18:06
  • $\begingroup$ Note that Big-O is an upper bound. A O(n) algorithm is automatically O(n^2) as well. \Theta(n^2) is more useful. $\endgroup$ Commented Aug 30, 2024 at 8:17

5 Answers

7
$\begingroup$

All you know is that an $O(n\log n)$ algorithm is eventually (a lot) faster than an $O(n^2)$ algorithm, but for any specific value of $n$, you can't say anything. Indeed, the asymptotic growth of the function doesn't depend on its value on $n < 100$. This means that if $f(n) = O(n^2)$ then for all $c_1,\ldots,c_{99}$, the following function is also $O(n^2)$: $$ f'(n) = \begin{cases} c_n & \text{if } n < 100, \\ f(n) & \text{if } n \geq 100. \end{cases} $$ So asymptotic notation is completely uninformative regarding the performance of algorithms on specific values of $n$.

You could say that functions such as $f'(n)$ are artificial and never really occur in practice. This is not quite true (you can hard-code some auxiliary information for small $n$, for example), but even if you consider "natural" functions, asymptotic notation doesn't help you to determine anything other than asymptotics: consider for example $n^2$ against $10^{100} n\log n$ (such examples definitely happen in practice, for example in fast matrix multiplication).
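
To make the last point concrete, here is a quick numerical check with a tamer, entirely made-up constant of $10^6$ in place of the $10^{100}$ above; even that is enough to push the crossover beyond $10^7$:

```python
import math

# Hypothetical cost functions: f models the O(n^2) method, g models an
# O(n log n) method burdened with a huge constant factor C.
# (C = 10**6 is for illustration only; 10**100 would push the crossover
# beyond any practical n.)
C = 10**6

def f(n):
    return n * n

def g(n):
    return C * n * math.log2(n)

for n in (10**3, 10**5, 10**8):
    print(f"n = {n:>11,}: O(n^2) faster? {f(n) < g(n)}")
```

For all $n$ up to about $2.5 \times 10^7$ the "worse" quadratic function is the smaller one, exactly the situation Al and Bob observed, just with a different crossover point.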

$\endgroup$
3
  • $\begingroup$ "such examples definitely happen in practice, for example in fast matrix multiplication" -- can you give an example of galactic algorithms used in practice? (Or was the scope of "practice" you were aiming for "in TCS research practice"?) $\endgroup$ Commented Jul 10, 2015 at 9:35
  • $\begingroup$ Perhaps simplex vs. ellipsoid algorithm would be a better-known example? $\endgroup$ Commented Jul 10, 2015 at 11:16
  • $\begingroup$ @Raphael These algorithms tend not to be used in practice, for obvious reasons. But they are not designed with this behavior – having unrealistic constants – in mind. $\endgroup$ Commented Jul 10, 2015 at 13:17
6
$\begingroup$

Remember that all asymptotic analysis tells us is how the running time varies with the size of the input for a particular algorithm. It doesn't tell us how many seconds an implementation of the algorithm will take to complete for a given input size. But we do know that if an $O(n^2)$ algorithm takes X seconds to process a set of items, it will take approximately 100 times as long to process 10 times as many items.

Asymptotic notation eliminates constant factors because they're not relevant to how the running time grows with the size of the input. When you're talking about running an $O(n^2)$ algorithm on 1,000,000 items, it's unlikely that your constant factor will make much difference; the $n^2$ term dwarfs it.

You can, however, estimate the constant factors. One way is by counting "basic operations" to come up with a rough estimate of how much work is done per iteration. Or, you could run both algorithms with some sample input, count the iterations, and figure out how many microseconds it takes per iteration. Either way, you'll come up with a value that is the constant for that particular algorithm.
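
For instance, a sketch of the counting approach for insertion sort (here one "basic operation" is a comparison or a shift; the constant that emerges is specific to this particular implementation):

```python
import random

def insertion_sort_ops(a):
    """Insertion sort on a copy of a, returning (sorted list, basic-op count)."""
    a = list(a)
    ops = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # one shift
            ops += 1
            j -= 1
        ops += 1              # the final comparison that ends the inner loop
        a[j + 1] = key
    return a, ops

random.seed(1)
for n in (100, 200, 400):
    data = [random.random() for _ in range(n)]
    _, ops = insertion_sort_ops(data)
    # ops / n^2 hovers near a constant (about 0.25 for random input,
    # since the inner loop averages i/2 shifts); that ratio is "Ci".
    print(n, round(ops / n**2, 3))
```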

So, say you want to compare insertion sort (an $O(n^2)$ algorithm) with quicksort (an $O(n \log n)$ algorithm). We know from empirical evidence that insertion sort is faster than quicksort when n is small (how small is another question). That's why optimized quicksort implementations switch to insertion sort when the subarray has fewer than 10 or so items.

Using one of the techniques I described above, you determine the constants for each of the algorithms. Let's call them $C_i$ and $C_q$. Given those, we can estimate that insertion sort will be faster than quicksort when $C_i\, n^2 < C_q\, n \log n$.

It should be obvious from looking at the two algorithms that $C_i < C_q$. Insertion sort is incredibly simple: the algorithm is nothing but comparison and swap, with a little loop overhead thrown in.

Quicksort is a little bit more complex, requiring more steps per iteration, but fewer iterations.
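
A minimal sketch of such a hybrid quicksort (the cutoff of 10 is an arbitrary illustration; real libraries tune it empirically):

```python
import random

CUTOFF = 10  # subarrays at or below this size go to insertion sort

def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] (inclusive) in place with plain insertion sort."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    """Quicksort that falls back to insertion sort on small subarrays."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(a, lo, hi)
        return
    # Lomuto partition around a randomly chosen pivot
    p = random.randint(lo, hi)
    a[p], a[hi] = a[hi], a[p]
    pivot, store = a[hi], lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]
    hybrid_quicksort(a, lo, store - 1)
    hybrid_quicksort(a, store + 1, hi)

data = [random.randint(0, 999) for _ in range(500)]
hybrid_quicksort(data)
```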

Consider sorting a five element array. Insertion sort will do, at worst:

  • 5 increments and comparisons of the outer loop control variable
  • 10 increments and comparisons of the inner loop control variable
  • 10 element comparisons
  • 10 swaps

(10 is the sum 1 + 2 + 3 + 4, the worst case for a fully reversed 5-element array.)

Now look at Quicksort, which in the average case has to partition four subarrays. The 5-element array gets split into two subarrays of 3 and 2 elements. The 3-element subarray is further partitioned into subarrays of 1 and 2 elements. And then the two 2-element subarrays are partitioned.

So the partition method will be called four times. Each partition step requires a minimum of two swaps (to place the pivot) in addition to comparing and swapping elements, plus other overhead. When you add it all up, you see that quicksort does more work per iteration. When the number of iterations is small, insertion sort does less total work even though it does more iterations.
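
The insertion-sort side of this tally can be checked with an instrumented version (a sketch; exact counts depend on implementation details):

```python
def insertion_sort_counts(a):
    """Sort a copy of a, counting element comparisons and swaps."""
    a = list(a)
    comps = swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comps += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                swaps += 1
                j -= 1
            else:
                break
    return a, comps, swaps

# Worst case for 5 elements: the fully reversed array.
s, comps, swaps = insertion_sort_counts([5, 4, 3, 2, 1])
print(comps, swaps)  # 10 comparisons, 10 swaps (the sum 1 + 2 + 3 + 4)
```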

$\endgroup$
3
  • $\begingroup$ "all asymptotic analysis tells us is how the running time varies with the size of the input for a particular algorithm. It doesn't say anything about the actual running time." -- this seems self-contradicting. Also, you seem to get muddled between the points "asymptotics don't tell the whole story" and "we don't actually analyse runtime". Valid points both, but best kept separate. $\endgroup$ Commented Jul 13, 2015 at 21:14
  • $\begingroup$ @Raphael: You're right. I think my modified answer clarifies the seeming contradiction, and separates the theoretical (asymptotic) from the real-world running time. $\endgroup$ Commented Jul 13, 2015 at 23:45
  • $\begingroup$ Better, thanks. I still think your answer better fits our related reference questions (1, 2). But then, the question is maybe the same as there. We'll see which issue the OP stumbled over. $\endgroup$ Commented Jul 14, 2015 at 5:33
4
$\begingroup$

the question and answers have a lot of notation. how about a picture, "worth a lot of words", and some zen-like visual/geometric intuition? this is a simple graph of $x \log(x)$ (red) vs $x^2/15$ (green). the rest of the explanation (relating the graph to big-O notation, etc.) is left as an exercise for the reader (e.g. solve for the exact region where one is supposedly "faster" than the other, and reshape the functions slightly so they match the constants $C,k$ in the question, etc.).
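
in case the image doesn't survive every medium, the crossover can also be found numerically; a sketch, assuming $\log$ here means the natural logarithm:

```python
import math

def f(x):
    return x * math.log(x)   # the red curve (natural log assumed)

def g(x):
    return x * x / 15        # the green curve

# The curves cross where log(x) = x/15; bisect for the larger root,
# above which x^2/15 stays on top for good.
lo, hi = 2.0, 100.0          # f > g at lo, f < g at hi
for _ in range(60):
    mid = (lo + hi) / 2
    if math.log(mid) > mid / 15:
        lo = mid
    else:
        hi = mid
crossover = (lo + hi) / 2
print(round(crossover, 2))
```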

$\endgroup$
1
  • $\begingroup$ (note, apparently not widely appreciated, big-O concepts are conceptually very similar to calculus limit concepts eg defn of derivative or $\epsilon/ \delta$ rule & ideally some calculus classes would incl discussion of basics of big-O, and also note connection of concept of intercepts in algebra/ geometry for determining special constants of the eqns etc) $\endgroup$ Commented Jul 14, 2015 at 18:12
0
$\begingroup$

Solving $c\,n^2 = k\,n \log(n)$ for $n$ gives a "point of no return": the value of $n$ beyond which the $O(n\log n)$ algorithm always wins, namely $n_{\text{critical}} = e^{-W(-c/k)}$, where $W$ is the Lambert W function (also called the product logarithm); the $W_{-1}$ branch gives the larger of the two crossing points.

By the way, it's worth noting that if $k < e\,c$ there is no intersection point between the two curves (no real solution), so in that case there is no range of small $n$ where the $O(n^2)$ algorithm wins.

So in general, the region of $n$ where $O(n^2)$ wins depends on the constants $c, k$, and these are directly related to the number and cost of the operations in each algorithm's loops. Larger values of $c$ or $k$ mean that the loop operations cost the CPU more, so a theoretically fast algorithm can be outweighed in practice by heavy per-iteration work.
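
A numerical sanity check of the closed form, with made-up constants $c=1$, $k=20$ (so $k > e\,c$ and a real crossing exists), solving $c\,n = k\log n$ directly by bisection rather than implementing $W$:

```python
import math

c, k = 1.0, 20.0   # hypothetical per-iteration costs; any k > e*c works

def diff(n):
    # c*n^2 - k*n*log(n) changes sign at the crossover
    return c * n * n - k * n * math.log(n)

# Bracket the larger root (the "point of no return") and bisect.
lo, hi = 10.0, 1000.0        # diff(lo) < 0, diff(hi) > 0
for _ in range(80):
    mid = (lo + hi) / 2
    if diff(mid) < 0:
        lo = mid
    else:
        hi = mid
n_critical = (lo + hi) / 2
print(round(n_critical, 2))
```

Below `n_critical` the quadratic curve is the smaller one; above it, $k\,n\log n$ never catches up again, matching $e^{-W_{-1}(-c/k)}$.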

$\endgroup$
0
$\begingroup$

As an example: I have two algorithms. Algorithm A precalculates some constants in a very expensive way, taking 1000 seconds, independent of n and the actual input values. Then it takes n log n nanoseconds to find a solution. Algorithm B always takes n^2 nanoseconds.

A takes at least 1000 seconds. B takes at most a millisecond for n <= 1,000, 0.1 seconds for n <= 10,000, 10 seconds for n <= 100,000, and 1,000 seconds for n <= 1,000,000. At that point A's n log n nanoseconds amount to just 0.02 seconds. For n >= 1,001,000, A is faster. For n = 1,000,000,000, A takes about 1,030 seconds while B takes 30 years. For n = 1,000,000,000,000, A takes about 41,000 seconds (under half a day) while B takes 30 million years.

So for small n, B was a lot faster. Then around n = 1,000,000, things turn around very quickly.

Note that n log n vs. n^2 is a massive difference. By contrast, a factor of log log n is in practice always a small number, so a more complex algorithm that saves only a factor of log log n may never actually run faster.
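
A tiny model of the two algorithms (taking log base 2 and the costs stated above) reproduces the turnaround near n = 1,000,000:

```python
import math

def time_A(n):
    """Model of A: 1000 s setup plus n*log2(n) nanoseconds."""
    return 1000.0 + n * math.log2(n) * 1e-9

def time_B(n):
    """Model of B: n^2 nanoseconds."""
    return n * n * 1e-9

for n in (10**3, 10**6, 2 * 10**6, 10**9):
    faster = "A" if time_A(n) < time_B(n) else "B"
    print(f"n = {n:>13,}: A = {time_A(n):.4g} s, B = {time_B(n):.4g} s -> {faster}")
```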

$\endgroup$
