Given an instruction address, can the starting address of the function enclosing it be determined?

Question

I've run into this problem in my current project, which requires reasoning about code at the binary level.

I think we can determine the starting location of all functions in a program by looking at the operands of CALL instructions. Once we have this list, can we determine which function encloses an address by simply searching backward until we find a start address? That is, is the start address of the function enclosing an instruction the greatest function start address that is less than or equal to the instruction address?
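For concreteness, the lookup described above can be sketched as a binary search over a sorted list of start addresses. This is only a sketch of the proposed heuristic, not a claim that it is correct: it assumes every function is contiguous, functions do not overlap, and every function is the target of at least one direct CALL. The addresses and the name `enclosing_start` are made up for illustration.

```python
from bisect import bisect_right

def enclosing_start(function_starts, address):
    """Return the greatest known start address <= address, or None.

    function_starts must be sorted in ascending order.
    """
    i = bisect_right(function_starts, address)
    return function_starts[i - 1] if i else None

# Hypothetical start addresses harvested from CALL operands.
starts = sorted([0x401000, 0x401040, 0x4010A0])
print(hex(enclosing_start(starts, 0x401057)))  # 0x401040
```

An address below the first known start yields `None`; an address past the end of the last function would still be attributed to that function, which is one of the ways the heuristic can mislead.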

If the above method is not correct, is there another way to find the starting address of the function enclosing an instruction?

edit: Added clarification of the question.

edit2: My method is probably wrong. Compilers are not guaranteed to place function bodies in contiguous regions of machine code.

Assembly language is not even required to use functions. It could just be a big spaghetti mess of gotos.
– Raymond Chen, Commented May 29, 2012 at 3:08
You're right in the context of assembly language. The context of this is the output of a compiled language.
– lea, Commented May 29, 2012 at 3:55

Raymond Chen · Accepted Answer · 2012-05-29 12:58:37Z

You need to constrain your problem space more. Even when constrained just to "the output of a compiled language", compilers nowadays are good at blurring the boundaries between functions. Inlining means one function can be enclosed within another. Tail-call optimization transfers control between two functions without a CALL instruction. Profile-guided optimization can create discontiguous functions. Code flow analysis and noreturn hints can result in code falling through to data. Jump tables mean that data can fall through to code without a CALL target. The only reliable way is to have the compiler explicitly tell you the instruction-to-function mapping, say via debug information. You didn't say what platform you're using, so it's hard to give more specific information.
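As a rough illustration of what an explicit instruction-to-function mapping buys you, here is a minimal sketch of a lookup over address ranges. The table is hypothetical; in practice it would be read from whatever debug information your platform provides (DWARF, for example, can describe a function with several address ranges). Because each entry carries an end address, a function split into discontiguous pieces is handled, and addresses that belong to no function are reported as such.

```python
from bisect import bisect_right

# Hypothetical (low, high, function) ranges with high exclusive, standing in
# for what debug information might report. "parse" has two discontiguous
# ranges, as can happen after profile-guided optimization.
RANGES = sorted([
    (0x1000, 0x1040, "parse"),
    (0x1040, 0x1090, "emit"),
    (0x2000, 0x2010, "parse"),  # cold part of parse, placed elsewhere
])
LOWS = [low for low, _, _ in RANGES]

def function_at(address):
    """Return the name of the function whose range contains address, or None."""
    i = bisect_right(LOWS, address) - 1
    if i < 0:
        return None
    low, high, name = RANGES[i]
    return name if address < high else None

print(function_at(0x2004))  # parse
print(function_at(0x1800))  # None
```

This sketch does not model inlining, where one address can legitimately belong to more than one source-level function.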

Thanks. Originally, I was hoping to be able to do this for any compiled binary, but it seems like I will have to constrain this to binaries compiled from C and use debug information.

Jason Goemaat · Accepted Answer · 2012-05-29 02:24:27Z

No, assembly code can do all sorts of funky things. A call might jump over another function entirely, jump backwards, or jump into another module.

3 Comments

Ira Baxter Over a year ago

In general you can't determine it. Your actual situation will vary with how the function's binary code is constructed; for some compilers, your scheme might work. But in general, you can't trust that what appear to be instructions are in fact instructions. If you can't count on that, you can't possibly trace your way "backwards" to the function start.

lea Over a year ago

Debuggers are able to do this. Should it be possible with debugging information?

Windows programmer Over a year ago

Debugging information ought to include each function's starting address, in order to be useful. So you can start at each function's starting address and go forwards in order to construct lists of instructions that might be executed by each function. Then, given the address of a particular instruction, you can figure out which function(s) might execute that instruction.
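The forward walk described above might look something like the following sketch. It assumes the code has already been disassembled into a map from each instruction address to its possible successor addresses within the same function (fall-through and branch targets, with calls treated as falling through and returns having no successors). That map, the function names, and the addresses are all hypothetical; building the map from a real binary is the hard part and is not shown.

```python
from collections import defaultdict

def reachable(successors, start):
    """Addresses reachable from start by following control flow."""
    seen, stack = set(), [start]
    while stack:
        addr = stack.pop()
        if addr in seen or addr not in successors:
            continue
        seen.add(addr)
        stack.extend(successors[addr])
    return seen

def owners_by_address(successors, function_starts):
    """Map each instruction address to the functions that may execute it."""
    owners = defaultdict(set)
    for name, start in function_starts.items():
        for addr in reachable(successors, start):
            owners[addr].add(name)
    return owners

# Hypothetical decoded program: address -> successor addresses.
SUCCESSORS = {
    0x10: [0x12],
    0x12: [0x14, 0x30],  # conditional branch into a shared tail
    0x14: [],            # return
    0x20: [0x30],        # a second function jumps to the same tail
    0x30: [0x32],
    0x32: [],            # return
}
STARTS = {"f": 0x10, "g": 0x20}  # start addresses, e.g. from debug info

owners = owners_by_address(SUCCESSORS, STARTS)
print(sorted(owners[0x30]))  # ['f', 'g']
```

Note that the result for an address is a set: with shared tails or tail calls, more than one function can reach the same instruction.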
