4

I have this problem, I am recursively calling a function in C and C is lexically scoped, so I can only access the current stack frame. I want to extract the arguments and the local variables from the previous stack frame which was created under the previous function call while im on the current stack frame

I know that the values from the previous recursive call are still on the stack, but I cant access access these values because they're "buried" under the active stack frame?

I want to extract the arguments and local variables from the previous stack and copy them to copy_of_buried_arg and copy_of_buried_loc;

It is a requirement to use inline assembly using GAS to extract the variables, this is what I have so far, and I tried all day, I cant seem to figure it out, I drew the stack on paper and did the calculations but nothing is working, I also tried deleting calls to printf so the stack will be cleaner but I cant figure out the right arithmetic. Here is the code so far, my function halts on the second iteration

#include <stdio.h> char glo = 97; // just for fun 97 is ascii lowercase 'a' int copy_of_buried_arg; char copy_of_buried_loc; void rec(int arg) { char loc; loc = glo + arg * 2; // just for fun, some char arithmetic printf("inside rec() arg=%d loc='%c'\n", arg, loc); if (arg != 0) { // after this assembly code runs, the copy_of_buried_arg and // copy_of_buried_loc variables will have arg, loc values from // the frame of the previous call to rec(). __asm__("\n\ movl 28(%esp), %eax #moving stack pointer to old ebp (pointing it to old ebp)\n\ addl $8, %eax #now eax points to the first argument for the old ebp \n\ movl (%eax), %ecx #copy the value inside eax to ecx\n\ movl %ecx, copy_of_buried_arg # copies the old argument\n\ \n\ "); printf("copy_of_buried_arg=%u copy_of_buried_loc='%c'\n", copy_of_buried_arg, copy_of_buried_loc); } else { printf("there is no buried stack frame\n");// runs if argument = 0 so only the first time } if (arg < 10) { rec(arg + 1); } } int main (int argc, char **argv) { rec(0); return 0; } 
10
  • 3
    Can't you just pass the address of the previous invocation's locals to the next function? Messing with assembly language can lead to no good things here. Commented May 2, 2012 at 1:48
  • How about running that in your debugger and getting the numbers from there? Commented May 2, 2012 at 1:48
  • @GregHewgill we are supposed to use assembly to dig the variables from the previous stack, its a requirement Commented May 2, 2012 at 1:51
  • 1
    I see. It would be useful to point out that sort of thing in the question. Commented May 2, 2012 at 1:52
  • @Alex how do I go about doing that. actually I had this question too, is there a program that graphically shows the stack? Commented May 2, 2012 at 1:53

2 Answers 2

1

I can try to help, but don't have Linux or assembly in GAS. But the calculations should be similar:

Here's the stack after a couple of calls. A typical stack frame setup creates a linked list of stack frames, where EBP is the current stack frame and points to its old value for the previous stack frame.

 +-------+ ESP-> |loc='c'| <- ESP currently points here. +-------+ EBP-> |oldEBP |--+ <- rec(0)'s call frame +-------+ | |retaddr| | <- return value of rec(1) +-------+ | |arg=1 | | <- pushed argument of rec(1) +-------+ | |loc='a'| | <- local variable of rec(0) +-------+ | +--|oldEBP |<-+ <- main's call frame | +-------+ | |retaddr| <- return value of rec(0) | +-------+ | |arg=0 | <- pushed argument of rec(0) | +-------+ \|/ to main's call frame 

This is created by the following sequence:

  1. Push arguments last arg first.
  2. Call the function, pushing a return address.
  3. Push soon-to-be old EBP, preserving previous stack frame.
  4. Move ESP (top of stack, containing oldEBP) into EBP, creating new stack frame.
  5. Subtract space for local variables.

This has the effect on a 32-bit stack that EBP+8 will always be the first parameter of the call, EBP+12 the 2nd parameter, etc. EBP-n is always an offset to a local variable.

The code to get the previous loc and arg is then (in MASM):

mov ecx,[ebp] // get previous stack frame mov edx,[ecx]+8 // get first argument mov copy_of_buried_arg,edx // save it mov dl,[ecx]-1 // get first char-sized local variable. mov copy_of_buried_loc,dl // save it 

or my best guess in GAS (I don't know it but know it is backwards to MASM):

movl (%ebp),ecx movl 8(%ecx),edx movl edx,copy_of_buried_arg movb -1(%ecx),dl movb dl,copy_of_buried_loc 

Output of your code with my MASM using VS2010 on Windows:

inside rec() arg=0 loc='a' there is no buried stack frame inside rec() arg=1 loc='c' copy_of_buried_arg=0 copy_of_buried_loc='a' inside rec() arg=2 loc='e' copy_of_buried_arg=1 copy_of_buried_loc='c' inside rec() arg=3 loc='g' copy_of_buried_arg=2 copy_of_buried_loc='e' inside rec() arg=4 loc='i' copy_of_buried_arg=3 copy_of_buried_loc='g' inside rec() arg=5 loc='k' copy_of_buried_arg=4 copy_of_buried_loc='i' inside rec() arg=6 loc='m' copy_of_buried_arg=5 copy_of_buried_loc='k' inside rec() arg=7 loc='o' copy_of_buried_arg=6 copy_of_buried_loc='m' inside rec() arg=8 loc='q' copy_of_buried_arg=7 copy_of_buried_loc='o' inside rec() arg=9 loc='s' copy_of_buried_arg=8 copy_of_buried_loc='q' inside rec() arg=10 loc='u' copy_of_buried_arg=9 copy_of_buried_loc='s' 
Sign up to request clarification or add additional context in comments.

3 Comments

My guess would be that any sort of stack protection would render this approach completely non-portable.
@tbert: What does stack protection have to do with this? The OP's code is unportable because it will break when compiled with different optimization options or a different version of the compiler.
I am light, I'm just having nightmares of a flood of questions about this from people who think it's a viable technique to pass information to further recursive iterations of a function. If the instructor isn't making sure there are a long list of caveats that go along with this hack, then I'll make damn sure somewhere on the internet has somebody saying "caveat emptor".
0

With my compiler (gcc 3.3.4) I ended up with this:

#include <stdio.h> char glo = 97; // just for fun 97 is ascii lowercase 'a' int copy_of_buried_arg; char copy_of_buried_loc; void rec(int arg) { char loc; loc = glo + arg * 2; // just for fun, some char arithmetic printf("inside rec() arg=%d loc='%c'\n", arg, loc); if (arg != 0) { // after this assembly code runs, the copy_of_buried_arg and // copy_of_buried_loc variables will have arg, loc values from // the frame of the previous call to rec(). __asm__ __volatile__ ( "movl 40(%%ebp), %%eax #\n" "movl %%eax, %0 #\n" "movb 31(%%ebp), %%al #\n" "movb %%al, %1 #\n" : "=m" (copy_of_buried_arg), "=m" (copy_of_buried_loc) : : "eax" ); printf("copy_of_buried_arg=%u copy_of_buried_loc='%c'\n", copy_of_buried_arg, copy_of_buried_loc); } else { printf("there is no buried stack frame\n");// runs if argument = 0 so only the first time } if (arg < 10) { rec(arg + 1); } } int main (int argc, char **argv) { rec(0); return 0; } 

Here's the disassembly of the relevant part (get it with gcc file.c -S -o file.s):

_rec: pushl %ebp movl %esp, %ebp subl $8, %esp movl 8(%ebp), %eax addl %eax, %eax addb _glo, %al movb %al, -1(%ebp) subl $4, %esp movsbl -1(%ebp),%eax pushl %eax pushl 8(%ebp) pushl $LC0 call _printf addl $16, %esp cmpl $0, 8(%ebp) je L2 /APP movl 40(%ebp), %eax # movl %eax, _copy_of_buried_arg # movb 31(%ebp), %al # movb %al, _copy_of_buried_loc # /NO_APP subl $4, %esp movsbl _copy_of_buried_loc,%eax pushl %eax pushl _copy_of_buried_arg pushl $LC1 call _printf addl $16, %esp jmp L3 L2: subl $12, %esp pushl $LC2 call _printf addl $16, %esp L3: cmpl $9, 8(%ebp) jg L1 subl $12, %esp movl 8(%ebp), %eax incl %eax pushl %eax call _rec addl $16, %esp L1: leave ret 

Those offsets from ebp (40 and 31) initially were set to an arbitrary guess value (e.g. 0) and then refined through observation of the disassembly and some simple calculations.

Note that the function uses extra 12+4=16 bytes of stack for the alignment and the parameter when it calls itself recursively:

 subl $12, %esp movl 8(%ebp), %eax incl %eax pushl %eax call _rec addl $16, %esp 

There are also 4 bytes of the return address.

And then the function uses 4+8=12 bytes for the old ebp and its local variables:

_rec: pushl %ebp movl %esp, %ebp subl $8, %esp 

So, in total the stack grows by 16+4+12=32 bytes with each recursive call.

Now, we know how to get our local arg and loc through ebp:

 movl 8(%ebp), %eax ; <- arg addl %eax, %eax addb _glo, %al movb %al, -1(%ebp) ; <- loc 

So, we just add 32 to those offsets 8 and -1 and arrive at 40 and 31.

Do the same and you'll get your "buried" variables.

2 Comments

Note that on at least x86-64 arguments actually arrives in registers (mostly), not on the stack, so this is not really possible.
@perh The question didn't mention 64-bit. Also, if registers aren't sufficient for keeping everything in them, something will inevitably be stored on the stack.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.