$\begingroup$

This happens to me from time to time: I compile my code with the highest optimization level (-Ofast) of the allegedly fastest compiler (GCC) of one of the fastest languages (C/C++). It takes 3 seconds. I run the compiled program, measuring performance. Then I make some trivial change (say, marking a function inline), compile it again, and it runs 20% faster.

Why? Often I'd rather wait a few minutes or even hours, but be sure that my code is at least hard to optimize further. Why does the compiler give up so quickly?

As far as I know, modern architectures are super complicated and hard to optimize for a priori. Couldn't a compiler test many possibilities and see which one is fastest? I effectively do this by making random changes in the source code, but that doesn't sound optimal.

$\endgroup$
  • $\begingroup$ gcc is open source, so you can download the source code, compile it, modify it, and have your question answered that way. $\endgroup$ Commented Jul 25, 2020 at 11:07
  • $\begingroup$ Not every change that is trivial for you to make is trivial or safe for a compiler to make. E.g. if a function is not inline, the compiler cannot inline it if it may potentially be called by other code, e.g. code that does not yet exist and that will be compiled separately. You may know that no such code will ever exist, while the compiler may have no way of knowing. $\endgroup$ Commented Jan 6 at 15:14
  • $\begingroup$ @reinierpost Exported functions are allowed to get inlined. $\endgroup$ Commented Jan 6 at 18:31

2 Answers

$\begingroup$

The problem is that on a modern, highly complex processor you cannot determine the execution time of code by compiling it, measuring its speed once, and assuming that every subsequent run will take exactly the same time.

For example, execution time depends on the temperature of the processor. If it gets hot, the clock speed may be reduced, or the code may be migrated to another, cooler core, which changes your measurements completely and by a highly variable amount. Since timing is temperature-dependent, it can even depend on your room temperature: turn the heating on and your computer slows down.

$\endgroup$
  • $\begingroup$ The number of cycles is equally variable, since you have CPUs, memory, various cache levels, hardware, multiple threads, and so on. $\endgroup$ Commented Jan 7 at 7:53
  • $\begingroup$ What about measuring things that are independent of the current state of the hardware, like the basic fact that calling a function (which involves a jump and passing arguments) is slower than inlining it? I know inlining is a thing in g++ (it even inlines class methods sometimes), but as the OP said, it sometimes could do better and doesn't, which is what's strange here. $\endgroup$ Commented Jan 7 at 9:27
  • $\begingroup$ @Coolguy That's what is commonly done. With modern processors, exact execution times are just impossible to predict; you can only produce code that tends to be better. Even a single loop can run in different states with different execution times, e.g. entering a different state after returning from an interrupt. $\endgroup$ Commented Jan 7 at 11:43
  • $\begingroup$ There's a misunderstanding; I didn't mean 'execution time' when I referred to 'measuring'. I merely meant an 'is this more optimized than that' sort of measurement, a comparison, like how a single jump instruction is faster than two jump instructions: comparisons that are deterministic. $\endgroup$ Commented Jan 7 at 13:03
$\begingroup$

This is a good idea, and research compilers have explored it under names like iterative compilation and superoptimization; mainstream compilers, however, do not search over candidate code versions and measure them by default.

Compiler optimization is currently designed as a number of relatively simple passes that run quickly.

What you are describing is a more complicated approach to compiler design that might happen in future compilers.

$\endgroup$
