3

I wonder whether it is worth doing a computation once and storing the result, or whether it is faster to do the computation twice.

For example in this case:

float n1 = a - b;
float n2 = a + b;
float result = n1 * n2 / (n1 * n2);

Is it better to do:

float result = (a - b) * (a + b) / ((a - b) * (a + b)); 

? I know that normally we store the result, but I wonder whether it isn't faster to redo the addition instead of going to memory to store and retrieve the value.

8
  • 2
    The ultimate answer is - it depends. Measure, always measure. Commented Feb 17, 2019 at 12:10
  • 2
    It is not a performance issue, it is a code-readability issue. What makes the code easier to read? Whether an additional float or two is declared is negligible. Commented Feb 17, 2019 at 12:10
  • 5
    To make the answer more complicated, but the question less important: many compilers perform common subexpression elimination on source code. The general rule, when working with a good optimising compiler, is that the programmer should leave such operations to the compiler, which will do them better. Commented Feb 17, 2019 at 12:12
  • 2
    @Fureeish Or just see what the compiler does Commented Feb 17, 2019 at 12:13
  • 3
    In this case, all forms of the calculation either give the value 1.0f (if abs(a) != abs(b)) or undefined behaviour otherwise. Doing any steps of the calculation at all is therefore quite questionable, let alone the question of whether to do it once or multiple times. A simple float result = 1.0f will do. Commented Feb 17, 2019 at 12:25

2 Answers 2

6

It really depends: For trivial examples like yours, it does not matter. The compiler will generate the same code, since it finds the common sub-expressions and eliminates the duplicated calculations.

For more complicated examples, for example those involving function calls, you are better off using the first variant and "storing" intermediate results. Do not worry about using simple variables for intermediate storage: these are usually kept in CPU registers, and the compiler is quite good at keeping values in registers.

The danger is that with more complex calculations the compiler may fail to perform the common subexpression elimination. This is, for example, the case when your code contains function calls, which act as an optimization boundary for the compiler.

Another issue is that with floating point, even simple operations like addition are not associative, i.e. (a+b)+c differs from a+(b+c), due to rounding in the lowest bits. This often also prevents common subexpression elimination, since the compiler is not allowed to change the semantics of your code.


1 Comment

Great point distinguishing floating-point arithmetic from other kinds in terms of what the compiler may optimise.
1

Dividing the expression into smaller expressions and giving them sensible names gives you several benefits:

  • It decreases cognitive load.
  • The longer expression becomes easier to understand and to verify as correct.
  • The lines of code become shorter, which makes them easier to read and helps adhere to coding standards.

In C++ a temporary variable can also be marked const, which may also allow the compiler to optimize the expressions better.

But optimizations should be measured before they are discussed and used as arguments. Speed usually comes from the choice of data structures and algorithms.

In general code should be written to be understood and be correct, and only then should it be optimized.


const float difference = a - b;
const float sum = a + b;
const float result = difference * sum / (difference * sum);

3 Comments

Yes on the style reasons, but const is irrelevant for optimization, as long you compile with optimization enabled. The compiler will analyze the lifetime of the value of a variable, and see that if sum isn't used after computing result, it doesn't need to keep the value anywhere except a register. If you do disable optimization, of course everything is terrible (Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?)
@PeterCordes Of course, provided the compiler in use performs lifetime analysis. const at least makes it quite explicit what can be done with the variable (assuming, again, that the compiler knows what to do with it).
All mainstream C compilers do such optimizations, usually as part of transforming through an SSA internal representation (e.g. GCC's GIMPLE, or LLVM-IR, or whatever MSVC or ICC use). I can't recall a case where adding const to a local variable has ever improved optimization, although I've mostly looked at output of good compilers (gcc and clang). (Globals are a different matter, of course). Still, this is very basic stuff that I'd expect any ahead-of-time optimizing compiler to do.