
Take an empty program:

// demo.c
int main(void)
{
}

Compiling the program at default optimization.

gcc -S demo.c -o dasm.asm 

I get the assembly output as

// Labels and directives that are not relevant have been removed
main:
    pushl %ebp          // prologue of main
    movl  %esp, %ebp    // prologue of main
    popl  %ebp          // epilogue of main
    ret

Now compiling the program at -O2 optimization.

gcc -O2 -S demo.c -o dasm.asm 

I get the optimized assembly

main:
    rep ret

In my initial search, I found that the optimization flag -fomit-frame-pointer was responsible for removing the prologue and epilogue.

I found more information about the flag in the GCC manual, but I could not understand the reason the manual gives, quoted below, for removing the prologue and epilogue.

Don't keep the frame pointer in a register for functions that don't need one.

Is there another way of putting the above reason?

What is the reason for the "rep" instruction appearing at -O2 optimization?

Why does the main function not require stack frame initialization?

If the frame pointer is not set up from within the main function, then who does this job?

Is it done by the OS, or is it a function of the hardware?

  • rep ret is a ret with a prefix that doesn't alter the semantics; it keeps some AMD processors happy (some of them have a penalty for jumping directly to a ret). Commented Mar 22, 2013 at 10:05
  • Possible duplicate of Avoiding gcc function prologue overhead? Commented Mar 11, 2016 at 21:08

1 Answer


Compilers are getting smart: GCC saw that you didn't need a frame pointer kept in a register, because nothing you put into your main() function uses the stack.
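To see the flag's effect on something slightly less trivial than an empty main(), here is a minimal sketch. The file name frame_demo.c and the function square are made up for this illustration; -fno-omit-frame-pointer is the standard GCC switch for turning the optimization back off.

```c
/* frame_demo.c: hypothetical file name, used only for this illustration. */

/* A leaf function: it calls nothing, and its argument and result fit in
   registers, so the optimizer has no reason to set up a frame pointer. */
int square(int x)
{
    return x * x;
}
```

Comparing the output of `gcc -O2 -S frame_demo.c` with `gcc -O2 -fno-omit-frame-pointer -S frame_demo.c` should show the push/mov prologue and pop epilogue only in the second case, though the exact instructions depend on the GCC version and target.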

As for rep ret:

Here's the principle. The processor tries to fetch the next few instructions to be executed, so that it can start the process of decoding and executing them. It even does this with jump and return instructions, guessing where the program will head next.

What AMD says here is that, if a ret instruction immediately follows a conditional jump instruction, their predictor cannot figure out where the ret instruction is going. The pre-fetching has to stop until the ret actually executes, and only then will it be able to start looking ahead again.

The "rep ret" trick apparently works around the problem, and lets the predictor do its job. The "rep" has no effect on the instruction.
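A function where this situation can come up is sketched below. The name count_bits is invented for the example, and whether "rep ret" actually appears is an assumption about your toolchain: it depends on the GCC version and the -mtune setting, and newer releases may emit a plain ret.

```c
/* Counts the set bits in x. The loop ends in a conditional jump, and the
   function's return typically comes right after it, which is the pattern
   described above. */
unsigned count_bits(unsigned x)
{
    unsigned n = 0;
    while (x != 0) {
        n += x & 1u;   /* add the lowest bit */
        x >>= 1;       /* move on to the next bit */
    }
    return n;
}
```

Compile it with `gcc -O2 -S` on an x86 target and look at the instruction that follows the loop's final conditional jump.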

Source: Some forum, google a sentence to find it.

One thing to note: the absence of a prologue doesn't mean there is no stack. You can still push and pop with ease; it's just that complex stack manipulation becomes difficult.

Functions that have no prologue/epilogue are usually dubbed naked. Hackers like to use them a lot because they don't contaminate the stack when you jmp to them. I must confess I know of no other use for them outside optimization. In Visual Studio it's done via:
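As a rough illustration of using the stack without a frame pointer, compare the two functions below. Both names are hypothetical, and the claim about the variable-length array is what GCC typically does on x86, not a guarantee.

```c
#include <stdio.h>
#include <string.h>

/* Fixed-size local buffer: the frame size is known at compile time, so the
   buffer can be addressed relative to the stack pointer alone. */
size_t format_fixed(int value)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%d", value);
    return strlen(buf);
}

/* Variable-length array: the frame size is only known at run time, so
   compilers typically keep a frame pointer here even at -O2.
   The caller must pass n >= 1. */
size_t fill_vla(size_t n)
{
    char buf[n];
    memset(buf, 'x', n - 1);
    buf[n - 1] = '\0';
    return strlen(buf);
}
```

With `gcc -O2 -S`, the first function will usually just adjust the stack pointer, while the second is likely to save and set up %ebp (or %rbp on x86-64).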

__declspec(naked) 

12 Comments

Now I tried declaring a variable inside main, wrote some statements that manipulate it, and used printf. Now I get a prologue but no epilogue (no pop instruction).
I would hazard a guess that the main() function simply doesn't need one because it's the end of the road. Try creating a function and calling printf() there.
Just because you are using the stack doesn't mean you need a stack frame; these days you'll only need a frame if you are using VLAs or _alloca. Also, it should be noted that the proper mnemonic in this case is PAUSE; REP is technically a prefix...
pause is rep nop; rep ret is something different.
@BarathBushan The stack adjustment is done by adding to or subtracting from the stack pointer, not by moving data to or from memory (that is expensive, and popping data just to discard it makes no sense).