
I recently ran into an error where a variable was initialized inside a switch statement. I started playing around with this more and realized I don't know the first thing about what GCC is trying to do in some of these optimizations.

Given this code:

    int main(int argc, char** argv) {
        switch (argc) {
            case 1000:
                return 42;
                int y = 24;
            default:
                return y;
        }
        return argc;
    }

The generated code always returns 42. What is going on? Why does the int y = 24 muck everything up?

    $ gcc -Wall -Werror -O2 -c test.c
    $ objdump -drwCS -Mintel test.o

    test.o:     file format elf64-x86-64

    Disassembly of section .text.startup:

    0000000000000000 <main>:
       0:   b8 2a 00 00 00          mov    eax,0x2a
       5:   c3                      ret
  • Returning y would be undefined behavior, as the initialization of that variable can never be reached; hence the compiler is free to assume that only the 1000 case can ever happen. (From the compiler's point of view, things that would be undefined behavior can never happen, and it optimizes accordingly.) Commented Feb 8, 2019 at 6:49
  • I updated my code to include a return statement outside the switch, which still generates the same code. Are you saying that the behavior for the rest of the function is undefined at that point? Commented Feb 8, 2019 at 6:53
  • The return outside of the switch doesn't matter; you've got a default with a return in the switch. Since this return is undefined behavior, the whole default case can never happen, so argc must be 1000. To make this clear: the compiler doesn't just remove undefined paths, it assumes these paths can never happen when you run the program. That's why the compiler assumes argc can never be anything other than 1000 at runtime. Commented Feb 8, 2019 at 6:55
  • That's really interesting. My intuition is that a switch statement, with or without a default, will still have its cases evaluated, and that some behavior would exist if no case matches, since argc is clearly not equal to 1000 all the time. Commented Feb 8, 2019 at 6:59
  • But what would it return when argc is not 1000? You don't know, and the compiler doesn't know. The line int y = 24; never executes, so you're returning an uninitialized value, which the compiler doesn't like at all. Since it's uninitialized anyway, it can just return 42, as you're not allowed to assume any particular value will be returned. 42 is just as good as 1, 15 or -3546465. It is uninitialized, so the result is undefined. Commented Feb 8, 2019 at 7:02

2 Answers

    int main(int argc, char** argv) {
        switch (argc) {
            case 1000:
                return 42;
                int y = 24;
            default:
                return y;
        }
        return argc;
    }

To explain this a bit more: a switch doesn't execute its body as a linear progression; it jumps straight to the matching label. The logic equivalent to this would be:

"If argc is 1000, return 42. Otherwise return y"

The int y = 24; is never used since it's never reached, so the compiler can optimize it out. And since the default case reads y without it ever being initialized, which is UB, the compiler might as well return 42 there too.

To fix this and get the behavior I suspect you intend, you just need to declare y outside of the switch statement.

    int main(int argc, char** argv) {
        int y = 24;
        switch (argc) {
            case 1000:
                return 42;
            default:
                return y;
        }
        return argc;
    }

4 Comments

Oh, it was intentional. I was reproducing a bug that had gone undiscovered. I understand that it is undefined behavior, but I am just curious about the optimization paths. So, gcc at -O2 is like "I got this. No warnings or errors. I'll just get rid of the code that doesn't make sense," but at -O0 it actually emits a warning about y being potentially uninitialized. For example, here is y being defined (but then a cross-initialization is attempted): godbolt.org/z/P7_80F. Is the logic still to ditch the default because the cross-initialization makes it undefined?
In any case, I'm marking your reply as accepted. It's undefined behavior. The compiler can do what it wants. Thanks for the answer.
I don't think it's a bug that it does this. Compilers will often make decisions like that when compiling.
Re “The int y = 24; is never used”: It is used; it defines y. Per C 2018 6.2.4 6, the object y is created (its lifetime begins) when execution enters the block of the switch statement, but initialization is not performed until execution reaches the declaration (hence, in this case, never). So the definition is used in some way (to define y), although its initialization is never executed.
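The distinction drawn in that last comment, that y exists throughout the block even though its initializer never runs, can be illustrated with a small variant of the question's code. This is only a sketch: the function name pick and the value 7 are invented for illustration and do not come from the original post. Because the default path assigns y before reading it, that path is well defined in C, even though the declaration's initializer is still skipped.

```c
/* Hypothetical variant of the question's code; pick() and the value 7
 * are illustrative assumptions, not part of the original post. */
int pick(int n)
{
    switch (n) {
    case 1000:
        return 42;
        int y = 24;   /* declares y for the whole switch block;
                         this initializer is never executed */
    default:
        y = 7;        /* y's lifetime has already begun, so assigning is fine */
        return y;     /* reads a value that was actually stored */
    }
}
```

Under these assumptions, pick(1000) returns 42 and any other argument returns 7. Note that this relies on C's rules; C++ rejects a jump past an initialized declaration.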

Cases in a switch are to be regarded as labels. If we translate your code to the equivalent goto-spaghetti, it might be easier to understand:

    int main(int argc, char** argv)
    {
      if(argc == 1000)
        goto label_1000;
      else
        goto label_default;

      label_1000:
        return 42;

      int y = 24;

      label_default:
        return y;

      return argc;
    }

The goto label_default jumps past the initialization of y, so the initialization is not executed on that path. The same thing happens in your switch.

Best practice when declaring variables inside switches is therefore to always use a compound statement per case:

    case 1000:
    {
      int y = 24;
      break;
    }

Apart from preventing spaghetti bugs, this also reduces the scope of the variable to the specific case.
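As a rough sketch of how the compound-statement-per-case style looks in a complete function (the name classify and the returned values are made up for illustration), each case gets its own block, so no jump can land between a declaration and its use:

```c
/* Hypothetical helper; classify() and its return values are
 * illustrative assumptions, not part of the original post. */
int classify(int n)
{
    switch (n) {
    case 1000: {
        int y = 24;      /* initialized every time this case runs */
        return y + 18;
    }
    default: {
        int y = -1;      /* a separate y, visible only in this block */
        return y;
    }
    }
}
```

Here classify(1000) returns 42 and every other argument returns -1. Since each y lives inside its own braces, the default label can no longer reach a variable whose initializer was skipped.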

2 Comments

What about in something like godbolt.org/z/P7_80F ? Is the compiler just confused about the initialization of y at this point?
@JustinVanHorne That's essentially the very same kind of spaghetti as my goto example. The outer y is never used and the return y refers to the inner y, which is never initialized.
