2

Here is my test code:

#include <stdio.h>

static inline void foo(int a)
{
    printf("%x\n", a);
}

int main(void)
{
    foo(0x1234);
    return 0;
}

I thought GCC should realize that a is a literal integer, and optimize to code like this:

puts("1234"); 

But I got the following assembly code:

0x8048341 <main+17> push $0x1234
0x8048346 <main+22> push $0x80484e0
0x804834b <main+27> push $0x1
0x804834d <main+29> call 0x8048310 <__printf_chk@plt>

A lot of such code exists in my project, because I always believed that GCC would optimize it for me. Even in contexts where I could simply use write(), I insisted on using printf, because I thought I would benefit from its buffering mechanism.

Now I regret it, because the overhead of parsing a format string will kill any gain I get. This code in my project is quite low-level, and it might become a performance bottleneck.

20
  • 3
    Relying on compiler to magically optimize things for you is usually a hopeless practice. Commented Jun 7, 2017 at 8:00
  • 3
    I don't think a compiler is ever allowed to optimize a library call away. You could link a library where printf() has side-effects the compiler can't know about at the compile stage. Commented Jun 7, 2017 at 8:02
  • 6
    @FelixPalmen No, the compiler can, because printf() is defined by the C standard. Commented Jun 7, 2017 at 8:03
  • 2
    @Stargateur Actually, passing an int for %x may be well-defined, as va_arg allows mixing signedness (7.15.1.1 in C99). GCC and Clang seem to agree with this interpretation, as they don't warn even with -Wformat=2. Commented Jun 7, 2017 at 8:16
  • 3
    @M.M GCC still needs to figure the correct string to output based on the format. It's surely possible to implement this optimisation but maybe it was regarded as not worth the troubles of maintaining a duplicate of printf. Maybe in a couple of years we'll get it? :) Commented Jun 7, 2017 at 8:23

2 Answers

6

This code in my project is quite low-level, and it might become a performance bottleneck.

First off, I can assuage your fears: format-string parsing will not be your bottleneck. The overhead of console I/O is tremendous (relatively speaking), so that will always be the bottleneck in your code, no matter what means you use to do it.

I thought gcc should realize that a is a literal integer, and optimize to code like this:

puts("1234"); 

Clearly it doesn't. GCC and Clang do perform an optimization where printf("...\n"); is transformed into puts("...");, as you can see here, but this only happens when the format string is a plain string literal with nothing to format. The optimizer doesn't (currently) peek into the format string, parse it, and optimize around it. You called printf, so you get printf.

Compiler optimizations aren't guaranteed, so you shouldn't write code that relies on them without first verifying that the desired optimizations are in fact being applied under all circumstances in which you are interested (including code variations, compiler versions, target platforms, etc.).

If you'd like to suggest this as a possible improvement for the GCC optimizer, you can suggest an enhancement on their Bugzilla. But don't hold your breath on it being implemented any time soon. The logic required to implement this type of optimization is not really worth the effort, considering the real-world performance improvements that could be expected are minimal at best (see above).

In the meantime, if you absolutely require this optimization with minimal changes to your code, then you can use some macro hackery:

#define STRINGIFY_INTERNAL(x) #x
#define STRINGIFY(x) STRINGIFY_INTERNAL(x)
#define foo(a) puts(STRINGIFY(a))

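This does produce the desired call to puts (note that the string is the argument exactly as spelled in the source, 0x1234, rather than what %x would print):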

.LC0:
        .string "0x1234"
MyFunction:
        sub     esp, 24
        push    OFFSET FLAT:.LC0
        call    puts
        xor     eax, eax
        add     esp, 28
        ret

6 Comments

It turns out that clang will optimize const char fmt[] = {'h', 'i', '\n', 0}; printf(fmt); into a puts. godbolt.org/g/aTQRKX But for gcc, it does literally have to be a string literal. (At least indirectly; const char *fmt = "hi\n"; and then passing that to printf optimizes, as long as there are no other args to printf.)
Interesting observation, @Peter. I had not considered that variation, and I'm honestly kind of surprised that GCC's pattern-matching optimization does not work there.
I was surprised too, but it makes some sense. I guess it doesn't look at char-arrays or their initializers as string constants, since it would have to check that they contain only a single 0 byte at the end. You can kind of see this in the asm output where gcc uses separate .byte directives instead of a .asciz, but clang notices the string-like nature and emits .asciz as well as doing the puts optimization.
Hmm, I wonder if there are any real-life missed-optimizations with strlen() or strchr() on char arrays instead of strings? This printf/puts one seems pretty artificial for a char array.
GCC 12.1 optimises to puts too, Peter. In fact, it has done so since 9.1.
2

A conforming library implementation could include functions beyond those defined by the Standard, which would change the way standard functions behave. For example, a library might include a __select_alternate_digits function that, when invoked, would cause succeeding calls to printf to display numbers using something other than the normal digits.

With such a library, given the code:

#include <stdio.h> // Could legitimately include functions that aren't
                   // defined by the Standard, but which start with __.
int main(void)
{
    __select_alternate_digits("⁰¹²³⁴⁵⁶⁷⁸⁹");
    printf("%d",123);
    __select_alternate_digits(0); // Reset to default set
}

the call to __select_alternate_digits might cause the above program to output "¹²³" rather than "123". If the compiler bundled its own printf function, it could know that its behavior wasn't going to be affected by any other function calls. If it's using an outside library, however, then unless the program is completely free of any calls to functions the compiler knows nothing about, a quality compiler should assume those functions might have effects the compiler can't predict.

7 Comments

This doesn't really explain anything: printf("123\n"); is replaced by a call to puts, while the OP's code isn't, despite the original function being one and the same: printf, so the compiler does know something about it.
@Ruslan: It would be unusual for implementations of printf to do any kind of translation with strings that don't contain %. I think gcc is perhaps being a bit presumptuous in assuming that they won't, but the further it goes into the realm of non-trivial format specifiers, the greater the likelihood of trouble. I did, however, forget another point: one of the uses of code containing printf with non-trivial format strings is to test printf implementations. If a programmer writes printf("Hello"); it's likely the programmer is more interested in outputting Hello\n than...
...in testing whether printf can handle such strings correctly. But given printf("%d\n",-2147483647-1); a programmer may not be interested in outputting -2147483648\n but rather in confirming that a particular printf implementation handles the INT_MIN case correctly. Having the compiler do the formatting itself would undermine that goal.
re: testing printf or memcpy or whatever implementations: use gcc -fno-builtin-printf.
glibc printf does allow customizations by registering new conversions, or even overriding standard conversion specifiers. (gnu.org/software/libc/manual/html_node/Customizing-Printf.html). So in a GNU system, it's only safe to transform printf calls when the format string contains no conversions, because stuff that runs before main could have changed the behaviour for %x.
