
I have a function which does a task; let's call this function F(). Now, I need to do this task n times, where n is sufficiently small. I can think of doing 2 things:

  1. //Code Here...
     Code-for-function-F()
     Code-for-function-F()
     .
     .
     .
     Code-for-function-F()
     //following code

  2. //Code Here
     for (int i = 0; i < n; ++i)
         F();
     //Following code

In the first case, I avoid function call overheads. But since the code is repeated n times, the code can be rather large and would lead to worse cache locality/performance. In the second case, the cache would be better utilized, but the function calls incur overhead. I was wondering if someone has done an analysis on which of the two is the better approach.

PS: I understand that the actual answer might depend on what code profiling tells me, but is there a theoretically better approach between the two? I am using C++ on Linux.

  • Compilers can unroll loops by themselves. Commented Jan 20, 2020 at 11:05
  • Concentrate on writing readable code instead of trying to outsmart your compiler. Algorithmic changes typically outweigh the performance impact of calling or not calling functions by far. Anyhow, if you want to know, you have to measure. Commented Jan 20, 2020 at 11:09
  • Btw, I don't understand your reasoning. "In the first case, I avoid function call overheads": the function calls are the same in both cases, no? Commented Jan 20, 2020 at 11:11
  • @formerlyknownas_463035818 What I meant was that in the first case, I inline the function myself rather than wait for the compiler to do it. This way, I end up copying the code, which would lead to worse cache locality, but I would avoid function call overhead. Commented Jan 20, 2020 at 11:15
  • @gundechaHills That's not what you have in the code example. Anyhow, also in deciding what to inline and what not, the compiler is more clever than you (and probably 99% of anybody else). Commented Jan 20, 2020 at 11:16

1 Answer


There is no one-size-fits-all answer when the question is which code is faster. You have to measure it.

However, the optimizations you have in mind, loop unrolling and function inlining, are techniques the compiler is really good at. It is rare that applying them explicitly in your code helps the compiler perform better optimizations. I would rather worry about preventing such compiler optimizations by writing unnecessarily clever code.

If you have a concrete example, I suggest you take a look at godbolt. It is a nice tool that can help you see the effect of variations in the code on the output of the compiler.

Also, don't forget the famous quote from D. Knuth:

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Often it is cited incompletely, while the last part is as important as the rest: "Yet we should not pass up our opportunities in that critical 3%." To know where those 3% are, you have to profile your code.

TL;DR: Don't do premature optimizations. Measure and profile first; only then do you know where it is worth improving, and whether you can get an improvement at all.
