176

Say I have this small function in a source file

static void foo() {} 

and I build an optimized version of my binary, yet I don't want this function inlined (for optimization purposes). Is there a macro I can add in the source code to prevent the inlining?

3

10 Answers

204

You want the gcc-specific noinline attribute.

This function attribute prevents a function from being considered for inlining. If the function does not have side effects, there are optimizations other than inlining that cause function calls to be optimized away, although the function call is live. To keep such calls from being optimized away, put asm (""); in the called function, to serve as a special side effect.

Use it like this:

void __attribute__ ((noinline)) foo() { ... } 
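
As a minimal sketch of how the attribute and the empty asm statement fit together, assuming a GNU-compatible compiler (GCC or Clang): the names add_one and call_add_one are made up for illustration and are not from the question.

```c
/* Sketch only: assumes GCC or Clang. add_one and call_add_one are
   hypothetical names used for illustration. */
__attribute__((noinline))
static int add_one(int x)
{
    __asm__ ("");   /* empty asm statement: acts as a side effect so calls are kept */
    return x + 1;
}

/* With optimization enabled, this should still contain a real call to add_one. */
int call_add_one(int x)
{
    return add_one(x);
}
```

Compiling with -O2 -S and looking for a call instruction in call_add_one is one way to check that the attribute took effect on your toolchain.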

9 Comments

Using gcc 4.4.3 on Arch Linux, I get a syntax error with the attribute placed as above. It works correctly when it precedes the function (e.g., __attribute__ ((noinline)) void foo() {}).
Arduino also wanted it placed before the function.
Edited to fix the attribute syntax.
The asm("") construct is actually fairly cross-platform and got the job done. I did it for x86 Linux and it did not cause a build problem on PowerPC AIX. Thanks for this useful suggestion!
The approach which requires code changes everywhere cannot be reasonably considered an acceptable answer.
47

GCC has a switch called

-fno-inline-small-functions

So use that when invoking gcc. But the side effect is that all other small functions are also non-inlined.

4 Comments

Didn't work at compiler level. Was using gcc 5.2.1 20150902 (Red Hat 5.2.1-2)
Either current GCC 6.4 is broken, or this and simpler -fno-inline do not work at all. gdb still enters methods on step-over. Something is broken, and I doubt it is gdb.
It will turn off inline optimization for all, not only for a specified function.
@ajeh Not inlining functions means that they are called normally, doesn’t it?
31

I know the question is about GCC, but I thought it might be useful to have some information about other compilers as well.

GCC's noinline function attribute is pretty popular with other compilers as well. It is supported by at least:

  • Clang (check with __has_attribute(noinline))
  • Intel C/C++ Compiler (their documentation is terrible, but I'm certain it works on 16.0+)
  • Oracle Solaris Studio back to at least 12.2
  • ARM C/C++ Compiler back to at least 4.1
  • IBM XL C/C++ back to at least 10.1
  • TI 8.0+ (or 7.3+ with --gcc, which will define __TI_GNU_ATTRIBUTE_SUPPORT__)

Additionally, MSVC supports __declspec(noinline) back to Visual Studio 7.1. Intel probably supports it too (they try to be compatible with both GCC and MSVC), but I haven't bothered to verify that. The syntax is basically the same:

__declspec(noinline) static void foo(void) { } 

PGI 10.2+ (and probably older) supports a noinline pragma which applies to the next function:

#pragma noinline
static void foo(void) { }

TI 6.0+ supports a FUNC_CANNOT_INLINE pragma which (annoyingly) works differently in C and C++. In C++, it's similar to PGI's:

#pragma FUNC_CANNOT_INLINE;
static void foo(void) { }

In C, however, the function name is required:

#pragma FUNC_CANNOT_INLINE(foo);
static void foo(void) { }

Cray 6.4+ (and possibly earlier) takes a similar approach, requiring the function name:

#pragma _CRI inline_never foo
static void foo(void) { }

Oracle Developer Studio also supports a pragma which takes the function name, going back to at least Forte Developer 6, but note that it needs to come after the declaration, even in recent versions:

static void foo(void);
#pragma no_inline(foo)

Depending on how dedicated you are, you could create a macro that would work everywhere, but you would need to have the function name as well as the declaration as arguments.
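
A rough sketch of what such a name-plus-declaration macro might look like. NEVER_INLINE_FN and DO_PRAGMA are hypothetical names, the TI branch assumes __TI_COMPILER_VERSION__ is the right detection macro and only covers C mode, and compilers whose pragma must follow the declaration (like the Oracle one above) are not handled:

```c
/* Helper: turn a token sequence into a _Pragma operator (C99). */
#define DO_PRAGMA(x) _Pragma(#x)

/* NEVER_INLINE_FN(name, decl) is a hypothetical macro, not part of any library. */
#if defined(_MSC_VER)
#  define NEVER_INLINE_FN(name, decl) __declspec(noinline) decl
#elif defined(__GNUC__)
#  define NEVER_INLINE_FN(name, decl) __attribute__((__noinline__)) decl
#elif defined(__TI_COMPILER_VERSION__) && !defined(__cplusplus)
   /* Assumption: TI in C mode, where the pragma has to name the function. */
#  define NEVER_INLINE_FN(name, decl) DO_PRAGMA(FUNC_CANNOT_INLINE(name);) decl
#else
#  define NEVER_INLINE_FN(name, decl) decl   /* unknown compiler: no guarantee */
#endif

/* Usage: the function name is repeated, which is the cost mentioned above. */
NEVER_INLINE_FN(add, static int add(int a, int b))
{
    return a + b;
}
```

The commas in the parameter list are safe because they sit inside parentheses, so the whole declaration is passed as a single macro argument.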

If, OTOH, you're okay with something that just works for most people, you can get away with something which is a little more aesthetically pleasing and doesn't require repeating yourself. That's the approach I've taken for Hedley, where the current version of HEDLEY_NEVER_INLINE looks like:

#if \
  HEDLEY_GNUC_HAS_ATTRIBUTE(noinline,4,0,0) || \
  HEDLEY_INTEL_VERSION_CHECK(16,0,0) || \
  HEDLEY_SUNPRO_VERSION_CHECK(5,11,0) || \
  HEDLEY_ARM_VERSION_CHECK(4,1,0) || \
  HEDLEY_IBM_VERSION_CHECK(10,1,0) || \
  HEDLEY_TI_VERSION_CHECK(8,0,0) || \
  (HEDLEY_TI_VERSION_CHECK(7,3,0) && defined(__TI_GNU_ATTRIBUTE_SUPPORT__))
#  define HEDLEY_NEVER_INLINE __attribute__((__noinline__))
#elif HEDLEY_MSVC_VERSION_CHECK(13,10,0)
#  define HEDLEY_NEVER_INLINE __declspec(noinline)
#elif HEDLEY_PGI_VERSION_CHECK(10,2,0)
#  define HEDLEY_NEVER_INLINE _Pragma("noinline")
#elif HEDLEY_TI_VERSION_CHECK(6,0,0)
#  define HEDLEY_NEVER_INLINE _Pragma("FUNC_CANNOT_INLINE;")
#else
#  define HEDLEY_NEVER_INLINE HEDLEY_INLINE
#endif

If you don't want to use Hedley (it's a single public domain / CC0 header) you can convert the version checking macros without too much effort, but more than I'm willing to put in ☺.
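
For the common cases only, a much smaller stand-alone sketch could look like the following. MY_NEVER_INLINE is a hypothetical name, this is not Hedley's code, and it deliberately ignores the pragma-based compilers; the _MSC_VER threshold of 1310 corresponds to the Visual Studio 7.1 version check shown above.

```c
/* Sketch only: MY_NEVER_INLINE is a made-up name, not taken from Hedley. */
#if defined(__has_attribute)
#  if __has_attribute(noinline)
#    define MY_NEVER_INLINE __attribute__((__noinline__))
#  endif
#endif
#if !defined(MY_NEVER_INLINE) && defined(__GNUC__) && (__GNUC__ >= 4)
#  define MY_NEVER_INLINE __attribute__((__noinline__))
#endif
#if !defined(MY_NEVER_INLINE) && defined(_MSC_VER) && (_MSC_VER >= 1310)
#  define MY_NEVER_INLINE __declspec(noinline)
#endif
#if !defined(MY_NEVER_INLINE)
#  define MY_NEVER_INLINE   /* unknown compiler: expands to nothing, no guarantee */
#endif

MY_NEVER_INLINE static int triple(int x)
{
    return 3 * x;
}
```

Unlike the Hedley version, the fallback here expands to nothing, so on an unrecognized compiler the function is simply left to the optimizer.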

2 Comments

Thanks for the link to your project @nemequ. I've asked our other developers to evaluate it for our use. We have diverse architectures.
I'd be very interested to know what they say, especially if they're not interested. And, of course, I'm around to answer questions (GitHub issue tracker, e-mail, whatever…).
21

A portable way to do this is to call the function through a pointer:

void (*foo_ptr)() = foo; foo_ptr(); 

Though this produces different instructions to branch, which may not be your goal. Which brings up a good point: what is your goal here?
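
A minimal sketch of a sturdier variant of the pointer approach, assuming the pointer is declared volatile so the compiler has to load it at every call and cannot assume it still points at the original function. The names square, square_ptr and call_square are made up for illustration.

```c
/* Hypothetical example function. */
static int square(int x)
{
    return x * x;
}

/* volatile-qualified function pointer: its value must be re-read for each
   call, so the compiler cannot prove the target is square and inline it. */
static int (*volatile square_ptr)(int) = square;

int call_square(int x)
{
    return square_ptr(x);   /* indirect call through the volatile pointer */
}
```

This costs an extra load and an indirect branch at each call site, so it is a poor fit if the goal is performance rather than inspection or debugging.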

3 Comments

If the pointer is defined at file scope, and not static, it should work since the compiler then can't assume it has its initial value at time of use. If it's a local (as shown) it's almost certainly treated the same as foo(). ("In this decade", he added, looking at the dates)
This will not work on modern compilers (e.g. gcc 11, clang 12): they will see right through this and inline the function anyway: godbolt.org/z/hh8dcnE3v Adding volatile makes it work again.
downvoted since as previous comment says: this is not reliable in all cases.
15

In case you get a compiler error for __attribute__((noinline)), you can just try:

noinline int func(int arg) { .... } 

Comments

12
static __attribute__ ((noinline)) void foo() { } 

This is what worked for me.

Comments

9

Use the noinline attribute:

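int func(int arg) __attribute__((noinline));              /* declaration: the attribute may follow the declarator */
__attribute__((noinline)) int func(int arg) { /* ... */ }   /* definition: GCC requires the attribute before the declarator */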

You should probably use it both when you declare the function for external use and when you write the function.

Comments

2

I work with gcc 7.2. I specifically needed a function to be non-inlined, because it had to be instantiated in a library. I tried the __attribute__((noinline)) answer, as well as the asm("") answer. Neither one solved the problem.

Finally, I figured that defining a static variable inside the function will force the compiler to allocate space for it in the static variable block, and to issue an initialization for it when the function is first called.

This is sort of a dirty trick, but it works.

1 Comment

You could define your function inline void foo(void) { ... } in a header and declare it extern inline void foo(void); in a library source file. Following C99 semantics, the compiler would be allowed to inline the function when it pleases AND emit object code in your library. See Is "inline" without "static" or "extern" ever useful in C99 ?.
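
A single-file sketch of the C99 pattern described in this comment, assuming C99 or later inline semantics (not gnu89). The function twice is a hypothetical example; in a real project the two parts would live in a header and in exactly one source file.

```c
/* Would normally live in a header (hypothetical twice.h): an inline
   definition that every including file may inline. */
inline int twice(int x)
{
    return 2 * x;
}

/* Would normally live in exactly one .c file of the library: this
   declaration makes the compiler emit an external definition of twice
   in that file's object code. */
extern inline int twice(int x);
```

With this arrangement the symbol exists in the library even if every call the compiler sees gets inlined.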
1

I couldn't get __attribute__((noinline)) to work, but this works on clang and GCC.

The Linux kernel defines noinline.

include/linux/compiler_attributes.h:#define noinline __attribute__((__noinline__))

#include <linux/kernel.h>
static noinline void foo(void);

1 Comment

In code where noinline is a macro, __attribute__((noinline)) would expand the macro definition, so you'd get __attribute__((__attribute__((__noinline__)))) which isn't valid. All GCC attributes have double-underscored version like __noinline__ to avoid this kind of problem. Additionally, some keywords are only recognized in their __name__ form in ISO instead of GNU mode (like -std=c23 instead of gnu23), such as __asm__ vs. asm, to avoid namespace pollution. But noinline is still recognized as an attribute in C23 mode.
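
A small user-space sketch of the point about the double-underscore spelling, assuming GCC or Clang. The macro mirrors the kernel definition quoted above; answer and also_answer are hypothetical functions.

```c
/* Same definition as the kernel header quoted above. */
#define noinline __attribute__((__noinline__))

/* Using the macro directly. */
static noinline int answer(void)
{
    return 42;
}

/* Spelling the attribute by hand while the macro is defined: the
   __noinline__ form is not touched by the macro, whereas writing
   __attribute__((noinline)) here would expand into a nested
   __attribute__ and fail to compile. */
static __attribute__((__noinline__)) int also_answer(void)
{
    return 43;
}
```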
0

__attribute__((noipa)) fully blocks all inter-procedural analysis

If you just want to look at the asm for a call site, the easiest thing is not to define the function in the same compilation unit, only declare it with a prototype. (That won't work if you're using -flto to allow cross-file inlining with link-time optimization, which you should for production builds, especially if you have small functions in separate .c files, not .h.)


__attribute__((noinline)) is sufficient for most use-cases, especially ones where you're actually planning to run the generated code and just want to stop the compiler from bloating your binary.

But sometimes you also want to look at the asm for a stand-alone definition of the function so you want a way to get both definition and call-site into one compile.

To fully block all inter-procedural analysis in GCC, use __attribute__((noipa)), either in addition to noinline or instead of it.

__attribute__((noinline))
void empty_noinline(){}

int foo(){
    empty_noinline();
    return 1;
}

GCC and Clang (with -O1 or higher) both compile foo without a call to empty_noinline since it does nothing. They aren't inlining it, they just see there are no side effects and optimize away the call itself.

# x86-64 GCC -O3
empty_noinline:
        ret
foo:
        mov     eax, 1
        ret

But with noipa, GCC acts like the definition wasn't visible, only a declaration. (Clang doesn't have an equivalent attribute that I know of, but ignores attributes it doesn't understand. I use noinline as well as noipa so at least noinline is there for Clang, for functions where that helps at all.)

__attribute__((noinline,noipa))
void empty_noipa(){}

int bar(){
    empty_noipa();
    return 2;
}

empty_noipa:
        ret
bar:
        sub     rsp, 8          # align stack pointer by 16 before a call
        call    empty_noipa     #### Call not optimized away
        mov     eax, 2
        add     rsp, 8
        ret

See these examples on Godbolt with GCC 15.2 and Clang 20.1 where you can play with them.


Inter-procedural optimization other than inlining or optimizing away

Putting some work into empty_noinline() will convince GCC and Clang to actually call it, like volatile int a=1; a*=2;.

But without noipa, GCC will omit aligning the stack to fully follow the ABI, since inter-procedural analysis finds that the function doesn't care about stack alignment by 16. GCC will also potentially make .constprop clones of functions, specialized for one arg being a compile-time constant. noipa blocks that, too.

__attribute__((noinline))
void empty_noinline(){
    volatile    // no side effects unless a is volatile
    int a=1;
    a*=2;
}

int foo(){
    empty_noinline();
    return 1;
}

(See the above Godbolt link; just uncomment volatile in empty_noinline.)

# x86-64 GCC -O3
empty_noinline:
        mov     DWORD PTR [rsp-4], 1
        mov     eax, DWORD PTR [rsp-4]   # GCC won't use memory-source ALU ops with volatile
        add     eax, eax
        mov     DWORD PTR [rsp-4], eax
        ret
foo:
                                         # note lack of sub/add RSP, or dummy push like Clang does
                                         # so RSP%16 == 8 at this point
        call    empty_noinline
        mov     eax, 1
        ret

Clang 20 still fully follows the calling convention:

empty_noinline:
        mov     dword ptr [rsp - 4], 1
        shl     dword ptr [rsp - 4]      # volatile doesn't force clang to load/store with separate insns
        ret
foo:
        push    rax                      # align the stack with a dummy push
        call    empty_noinline
        mov     eax, 1
        pop     rcx                      # and undo it into a different call-clobbered reg
        ret

The above GCC optimizations aren't necessarily bad depending on what you wanted. If you want an example of how to write correct asm by hand which follows the calling conventions, then any IPA is bad. But if you want efficient compact code, then opportunistically skipping stack alignment is a good thing when it doesn't hurt the callee.
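
The .constprop point mentioned earlier can be sketched the same way. This assumes a GNU-compatible compiler; NOIPA, scale and scale_by_three are hypothetical names, and the __has_attribute check is there so compilers without noipa fall back to plain noinline.

```c
/* Sketch only: NOIPA is a made-up macro name. */
#if defined(__has_attribute)
#  if __has_attribute(noipa)
#    define NOIPA __attribute__((noinline, noipa))
#  endif
#endif
#ifndef NOIPA
#  define NOIPA __attribute__((noinline))   /* fallback when noipa is unavailable */
#endif

/* With only noinline, GCC may create a clone of scale specialized for
   factor == 3. With noipa the caller has to pass both arguments to the
   one real definition. */
NOIPA int scale(int x, int factor)
{
    return x * factor;
}

int scale_by_three(int x)
{
    return scale(x, 3);   /* constant argument: a candidate for .constprop cloning */
}
```

Comparing the -O3 output with and without noipa should show whether a scale.constprop clone appears on your GCC version.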


Comments
