-7

I have researched bubble sort speed differences between C and Assembly language, and found that code optimization is one factor.

What other factors are there to consider for bubble sort speed differences between C and Assembly language and which tends to be faster?

7
  • 1
    Extending to the answers given so far (question closed already): C compilers do a really good job in optimising, as denoted already, and you need to be a really good expert in ASM and the specific machine you are writing for to achieve similarly good output. Most humans way not as good as as the compiler, and even if, it's usually just not worth all the effort! There are situations you still can profit from ASM, though. Commented Oct 20, 2023 at 9:41
  • Typical scenario: You profile your application first (!!!) and discover a specific function being a bottleneck in execution time or (depending on what you intend to optimise for) in memory consumption. Then you let the compiler emit ASM code instead of machine code, and on this basis you try to optimise further for this one specific use case the code generated by the compiler optimised for the general, i.e. any arbitrary, use case resulted in not being efficient enough. Again: You need deep background knowledge about machine and algorithm to be able to do so! Commented Oct 20, 2023 at 9:45
  • 1
    @Aconcagua: 100% agreed with the "not worth the effort" point, especially when there's much more to gain from changing the algorithm (to anything other than Bubble Sort). As someone who is capable of noticing missed-optimizations in compiler output, I still wouldn't consider sending in patches to software projects like glibc with hand-written asm versions of most functions that didn't already have one, because it's not worth the maintenance burden. (And might not be faster on some future CPU of the same architecture, especially once compilers improve.) Commented Oct 20, 2023 at 18:23
  • I think the best comment on this page is the most true which is if you know exactly what you are doing it will never be slower than C. Compilers like gcc have been getting worse over time gcc 4.x.x is better than anything that came after (for certain targets). You can almost always find a number of optimizations in the output of the compiler, some of those are of course the compiler is limited to the rules of the C language and asm has no rules, limited only to the logic/hardware rules. Commented Oct 21, 2023 at 6:54
  • 1
    The C version alone can have widely ranging performance values based on the skills of the author and the knowledge of that target and that specific compilers nuances. This is the problem with benchmarks the fantasy that the same (high level language) code runs the same on the same machine or similar machines when the same code can be several times faster or slower on the same machine, much less be used to measure against other machines. Commented Oct 21, 2023 at 7:00

3 Answers 3

3

CPUs run machine code. A C compiler can make machine code for you from C source, vs. with assembly language you're making all the important decisions yourself.


The most important factor is that Bubble Sort is a bad algorithm you should basically never use, especially when you care about performance (Why bubble sort is not efficient?). (Only sometimes if you care about tiny code size, like code golf Sort an Integer List).

Other than that, if you know exactly what you're doing in asm, it will never be slower than C, and you can control micro-optimization choices like the ones that led to GCC making a very slower bubble sort when trying to auto-vectorize. See Bubble sort slower with -O3 than -O2 with GCC for the details of that case.

But if you don't know asm and CPU architecture in that level of detail, you're unlikely to improve on compiler output. There is no "generally" that's true across all CPU (micro-)architectures and all programmer skill-levels.

Related: Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? - another example of the kinds of beginner mistakes that can make your asm even slower than a debug build of a C program.


If you cared about making a sort function run fast, the first thing you'd change would be the algorithm, to Insertion Sort or something. Or a SIMD sorting network, especially if you're writing in non-portable assembly language in the first place.

So this question only makes sense if there's something forcing you to use a bad algorithm like Bubble Sort. Or if you aren't aiming for performance in the first place and are making arbitrary choices of algorithm. Or don't know that Bubble Sort is slow, in which case we should definitely make pessimistic assumptions about programmer skill at assembly language. (Assembly language performance is even more sensitive than most languages to programmer skill at optimizing.)

But if someone forced you to spend time optimizing Bubble Sort, with asm you could probably optimize the case where an element bubbles a long way, maybe avoiding a store/reload as part of a loop-carried dependency chain. But you could probably do the same thing in C with a tmp variable, so IDK whether to count it.

You'd only think of doing that in C if you were aware of the asm / CPU-architecture reasons for it, but often you can get a compiler to generate the efficient asm you want by changing the C. That's usually best because it's still portable C.

Sign up to request clarification or add additional context in comments.

Comments

3

C gets converted to assembly by the compiler and these days the compiler does a very good job, to the point of being faster than a very experienced human, even with infinite time in their hands.

It is not only a matter of which instructions to use but also their order. These days CPUs have multiple internal resources and the compiler knows how to optimize them such that it maximizes throughput. Memory loads and stores are also explicitly optimized.

In that sense, the assembly code that works the fastest for one CPU architecture, might be sluggish for another architecture, even inside the same family (eg Intel). So the C language works as a general template for algorithm correctness and all the hard work of optimization gets done by the compiler underneath. This optimization work is typically better done by a machine.

So overall Id say that apart from very specific cases like very SIMD heavy algorithms in audio/video or cryptography, C (with a modern compiler) is always the winner.

13 Comments

Compilers like gcc have been getting worse over time not better, and to use the word "always", in an decent sized project you can always find the output of the compiler can be optimized or improved. But, yes you have to know what you are doing. And until you do, post questions on a site like this...IMO the reason the compilers are getting worse is because folks are told to stick to high level and avoid learning asm. Each year fewer and fewer able to actually make a functional compiler much less optimizers.
the compiler has no idea about the system, speeds of the memory choosing between extra instructions to build/compute a value vs loading that value from a pool, etc. We have not been processor bound for a while, and the compiler has no clue how to optimize for the system.
Worse? Nonsense. GCC beats clang/LLVM by a large margin nowadays. And saying that gcc 4.7 performs better than gcc 13.0 is just preposterous. The number and quality of passes is exceedingly better than ten years ago.
@old_timer From your statement I can see you have zero idea of all the developments that took place in the past 10 years. Compilers today have a complete model of every sub-architecture even before it is launched. They optimize ALU, memory load/stores, and vector resources before converting from IR into assembly. That's what the '-march' and '-mtune' options are for. I suggest you read more, you need to catch up with compiler technology. Here is a start: llvm.org/devmtg/2014-10/Slides/Estes-MISchedulerTutorial.pdf
Clever humans can always use compiler output as a starting point, unless they see a better strategy entirely. Where compilers do well is inlining across large code-bases, to create asm that would be unmaintainable. So in that sense, yes, compilers can do better than humans practically can for non-tiny amounts of code. But this question was about a single small function with 2 nested loops, Bubble Sort. On that scale, the most skilled humans are more than able to compete with compilers.
|
0

Practically, there really isn't much of a speed difference.

While it is true that Assembly allows you to directly access individual registers on the CPU and that this might allow you to make some small optimisations, humans typically aren't good at seeing such small optimisations.

For that reason, it's usually better (or at least, not slower) to just program in C and allow the compiler to optimise it. Even without very low-level access, C compilers typically can optimise a program better than a human could in Assembly.

3 Comments

Practically, there really isn't much of a speed difference. - That's not true in general. There are certainly counter-examples for other algorithms, where novice mistakes in asm can make things way slower than an optimizing C compiler. Or an asm implementation optimized for code-size can be much slower. It's easy to imagine a naive Bubble Sort in x86 asm that used xchg mem, reg as part of the swap, and thus was much slower than what a compiler would do. (In fact I don't have to image; people have posted such answers on SO.) (xchg mem,reg has an implicit lock prefix on 386 and up.)
Also, if anyone actually cared about Bubble Sort performance and you were forbidden from just implementing Insertion Sort instead (or a SIMD sorting network), with asm you could probably optimize the case where an element bubbles a long way, maybe avoiding a store/reload as part of a loop-carried dependency chain. But you could probably do the same thing in C with a tmp variable, so IDK whether to count it. You'd only think of doing that in C if you were aware of the asm / CPU-architecture reasons for it, though.
I used to write hand-written assembly up to 10 years ago when LLVM took off - and brought GCC with it - into optimizing passes. After some time, I realized my hand written assembly was often slower because of non-optimized instruction pressure and badly placed load/stores. Just not worth it anymore.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.