8

Today in a interview I have got the question asking which sort you use for multi threaded application.Weather it is a merge sort or quick sort.

3
  • 1
    Does the sort have to be multithreaded or can sorting be done on one tread and all other stuff - elsewhere? Commented Mar 21, 2011 at 7:43
  • 1
    @sharptooth: In the latter case it would be more of a trick-question, wouldn't it? Commented Mar 21, 2011 at 7:46
  • For practice I implemented a threaded sort in C++ using the quicksort algorithm. You can find the source here: github.com/cmcantalupo/sorter_threaded Commented Jul 13, 2011 at 17:04

5 Answers 5

8

You use merge sort for multi-threaded applications.

The reason:

Merge sort divides the problem into separate smaller problems (smaller arrays) and then merges them. That can be done in separate threads.

Quick sort does a pivot sort on a single array, so it's harder to divide the problem efficiently between threads.

Sign up to request clarification or add additional context in comments.

3 Comments

Both can be effectively implemented multithreaded, the division of tasks will be simpler in mergesort, but you can use quicksort by obtaining a pivot, applying the first iteration, after which all elements smaller than pivot are placed before pivot in the array and all greater are placed after the pivot. Then use two threads to order both sides of the array... Merge sort is still simpler to implement and the task division will be more even so the potential gains are higher (if you can spare a bit of memory).
Right, it's not that you can't it's just harder
This answer isn’t correct. Both merging in merge sort and dividing in quicksort can be done in parallel on independent data. Only the first divide operation and the last merge operation would then have to operate on the same data. So the problem is actually the same for quicksort and merge sort. In fact, parallelising quicksort is much much easier than parallelising merge sort if you use the simple schema in my answer.
4

Every divide and conquer algorithm can be quite easily parallelised. Merge sort and quicksort both follow the same basic schema which can be run in parallel:

procedure DivideAndConquer(X) if X is a base case then Process base case X return Divide X into [Y0 … Yn[ for Y ∈ [Y0 … Yn[ in parallel do DivideAndConquer(Y) Merge [Y0 … Yn[ back into X 

Where they differ is that in quicksort, the division is difficult and merging is trivial (no operation). In merge sort, it’s the other way round: dividing is trivial and merging is difficult.

If you implement the above schema, quicksort is actually easier to parallelise because you can just forget about the merge step. For merge sort, you need to keep track of finished parallel tasks. This screws up the load balancing.

On the other hand, if you follow the above schema, you’ve got a problem: the very first division, and the very last merging, will only use a single processor and all other processors will be idle. Thus it makes sense to parallelise these operations as well. And here we see that parallelising the partitioning step in quicksort is much harder than parallelising the merge step in merge sort.

2 Comments

I'm not entirely convinced with the load balancing. Sure, you need to synchronize before doing the the last merge step, but what the disadvantage to synchronizing after all threads are done with qsort? You do need a way to tell whether the algorithm has terminated after all, and that involves sync too.
@Itax: that involves one synchronisation at the very end of the algorithm. With merge sort, you need to synchronise in every step (or rather, you don’t have discrete steps with a work stealing technique but you still need to synchronise tasks). For details, read my master thesis (madr.at/msc), chapter IX.
3

A merge sort seems like it would be easier to parallelize and distribute...think about it, you're breaking it up into clean sub problems that can easily be divided and distributed. But then again, the same is true of quicksort. However, I would probably prefer doing it with merge sort as it would likely be easier.

Comments

3

Assuming a decent pivot selection, it's not all that different.

Subproblems are trivial to parallelize; they use (mostly) disjoint memory and need no synchronization, so the actual difference lies in the bottlenecks: the initial partition of quick-sort vs. the final merge in merge-sort. Neglecting to parallelize these will result in bad speedups for many cores or few elements (This gets noticeable a lot faster than you might think!).

Both algorithms can be parallelized efficiently. See this MCSTL paper for some experimental results and implementation details. The MCSTL was the base for what is now the GNU C++ std-lib parallel mode.

It's not all clear which algorithm will perform better in all circumstances as it depends on data distribution and about whether swaps or comparisons are slower.

4 Comments

On a side-note: if you're lazy and do not want to parallelize the first/final pass respectively, merge-sort is obviously the winner for more than two threads. You can partition into N chunks directly, whereas quick-sort needs successive partitions to do so.
The answer is top-notch but I don’t agree with the comment. In fact, if you do any non-trivial load balancing then quicksort is much easier than merge sort because you can fire and forget the threads that recursively execute quicksort; for merge sort, you need to manage the parallel tasks that have to be merged back together. Quicksort may need successive partitions but merge sort also uses successive merges. In quicksort you will start with few threads – in merge sort you will end with few threads.
@Konrad: Ah, my bad. I guess I didn't mention that I was assuming a multi-way merge to do the final merge pass, so you don't need to do that successively. Sure you can partition into N chunks too, but that's a whole different algorithm altogether (pretty much like sample-sort). Also, multi-way complexity also depends on the number of sequences, so I guess you got a pretty valid point there. - Good point on the load balancing too - I guess I was assuming that you didn't really have that.
@Itjax True, a multiways merge takes care of this particular problem.
1

I think they are looking for merge-sort as an answer, since it is easy to see how to split this between threads. Though another comment indicates that qsort can also be split into smaller problems. Likely many can be split into smaller problems.

There is one critical aspect that cannot be ignored. Communicating with the other threads takes a lot of time. The data set your are sorting has to be huge, or very expensive to compare, before creating the threads and doing the communication between them will be better than just using a single thread.

Further to this, with any sort, you have a serious problem of false sharing. Having multiple threads work with the same data can (communication time notwithstanding) be slower as the CPU is forced to share and update data between multiple cores. Unless your algorithm can properly align the data, passing it off to various threads will slow it down.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.