
When I grep for malloc in the symbol table with the following command:

readelf -s bin | grep malloc 

I can see that the symbols malloc, __malloc and __libc_malloc all share the same code address. I can capture the PC address and want to know when a user program calls malloc, but __malloc and __libc_malloc give me noisy information. Is there a good way to single out malloc? I compiled the binary with -static, so dlsym doesn't work in this case.

  • @user1147800 so you compiled your code as both C and C++ and got the same output? Commented Mar 3, 2014 at 19:33
  • @crashmstr, compiled with gcc rather than g++ Commented Mar 3, 2014 at 19:36
  • @user1147800 then why tag C++? Commented Mar 3, 2014 at 19:36
  • @crashmstr, it doesn't really matter whether it's g++ or gcc, with malloc/free, the cstdlib is called with g++, which goes to glibc as well. So it's exactly the same regardless of g++ or gcc. I think it's still relevant to g++ and c++ users. Commented Mar 3, 2014 at 19:54
  • @crashmstr, I would appreciate it if you could add the c++ tag back, as I believe it is relevant. If you think otherwise, I would appreciate it if you could convince me with reasons. Many thanks! Commented Mar 3, 2014 at 19:56

1 Answer


You won't be able to tell them apart unless you use dynamic linking: all three names refer to the same routine, and static linking replaces each name reference with the address of that routine.
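As a rough illustration of how one routine can carry several names, here is a minimal sketch of symbol aliasing. It assumes GCC or Clang on an ELF target, and demo_alloc and demo_alloc_alias are hypothetical names made up for this sketch; they are not glibc's own definitions, although as far as I know glibc sets up malloc and __malloc as aliases of __libc_malloc in a similar way.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator standing in for __libc_malloc. */
void *demo_alloc(size_t size)
{
    return malloc(size);
}

/* A second name for the same code; it gets no body of its own.
 * The alias attribute is a GCC/Clang extension for ELF targets. */
void *demo_alloc_alias(size_t size) __attribute__((alias("demo_alloc")));

/* Returns 1 when both names resolve to a single address. */
int demo_names_share_address(void)
{
    return demo_alloc == demo_alloc_alias;
}
```

Both names end up in the symbol table with the same value, so a lookup from an address back to a name has more than one valid answer.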

Take an example:

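    #include <stdlib.h>

    /* Not declared in <stdlib.h>, so declare the extra names by hand. */
    extern void *__malloc(size_t);
    extern void *__libc_malloc(size_t);

    int main(int argc, char **argv)
    {
        /* Same allocation through each of the three names. */
        void *v = malloc(200);
        free(v);
        v = __malloc(200);
        free(v);
        v = __libc_malloc(200);
        free(v);
        return 0;
    }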

Compile it with gcc -static -o example example.c and disassemble the main routine, and we see:

    40103e: 55                      push   %rbp
    40103f: 48 89 e5                mov    %rsp,%rbp
    401042: 48 83 ec 20             sub    $0x20,%rsp
    401046: 89 7d ec                mov    %edi,-0x14(%rbp)
    401049: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
    40104d: bf c8 00 00 00          mov    $0xc8,%edi
    401052: e8 19 52 00 00          callq  406270 <__libc_malloc>
    401057: 48 89 45 f8             mov    %rax,-0x8(%rbp)
    40105b: 48 8b 45 f8             mov    -0x8(%rbp),%rax
    40105f: 48 89 c7                mov    %rax,%rdi
    401062: e8 09 56 00 00          callq  406670 <__cfree>
    401067: bf c8 00 00 00          mov    $0xc8,%edi
    40106c: e8 ff 51 00 00          callq  406270 <__libc_malloc>
    401071: 48 89 45 f8             mov    %rax,-0x8(%rbp)
    401075: 48 8b 45 f8             mov    -0x8(%rbp),%rax
    401079: 48 89 c7                mov    %rax,%rdi
    40107c: e8 ef 55 00 00          callq  406670 <__cfree>
    401081: bf c8 00 00 00          mov    $0xc8,%edi
    401086: e8 e5 51 00 00          callq  406270 <__libc_malloc>
    40108b: 48 89 45 f8             mov    %rax,-0x8(%rbp)
    40108f: 48 8b 45 f8             mov    -0x8(%rbp),%rax
    401093: 48 89 c7                mov    %rax,%rdi
    401096: e8 d5 55 00 00          callq  406670 <__cfree>
    40109b: b8 00 00 00 00          mov    $0x0,%eax
    4010a0: c9                      leaveq
    4010a1: c3                      retq
    4010a2: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    4010a9: 00 00 00
    4010ac: 0f 1f 40 00             nopl   0x0(%rax)

That is, all three calls go to the same address (406270), which the disassembler labels __libc_malloc; the code doesn't differentiate the entries.
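To make that concrete, the sketch below decodes the e8 call instructions from the static listing. It assumes the standard x86-64 near-call encoding (opcode e8 followed by a signed little-endian 32-bit displacement relative to the next instruction); call_rel32_target is a hypothetical helper name, not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Decode a 5-byte x86-64 near call (e8 + little-endian rel32).
 * insn_addr is the address of the e8 byte. Returns the call target,
 * or 0 if the bytes are not an e8 call. */
uint64_t call_rel32_target(uint64_t insn_addr, const uint8_t insn[5])
{
    if (insn[0] != 0xe8)
        return 0;

    /* Assemble the displacement byte by byte so host endianness is irrelevant. */
    uint32_t raw = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8) |
                   ((uint32_t)insn[3] << 16) | ((uint32_t)insn[4] << 24);

    /* The displacement is signed; reinterpret the bits without a cast. */
    int32_t rel;
    memcpy(&rel, &raw, sizeof rel);

    /* The displacement is relative to the instruction after the call. */
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}
```

Fed the bytes at 401052, 40106c and 401086, this returns 0x406270 each time: only the address is encoded in the instruction, and the name used in the source is gone.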

Now, if you use dynamic linking, you get a different result. For one thing, __malloc is not available in the resulting binary. This is because the __malloc name is a side-effect of the static linking (there is a way to prevent it from being produced, but the mechanism escapes me at the moment). So when we compile the binary dynamically (removing the __malloc call), main looks like:

    40058d: 55                      push   %rbp
    40058e: 48 89 e5                mov    %rsp,%rbp
    400591: 48 83 ec 20             sub    $0x20,%rsp
    400595: 89 7d ec                mov    %edi,-0x14(%rbp)
    400598: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
    40059c: bf c8 00 00 00          mov    $0xc8,%edi
    4005a1: e8 ea fe ff ff          callq  400490 <malloc@plt>
    4005a6: 48 89 45 f8             mov    %rax,-0x8(%rbp)
    4005aa: 48 8b 45 f8             mov    -0x8(%rbp),%rax
    4005ae: 48 89 c7                mov    %rax,%rdi
    4005b1: e8 9a fe ff ff          callq  400450 <free@plt>
    4005b6: bf c8 00 00 00          mov    $0xc8,%edi
    4005bb: e8 c0 fe ff ff          callq  400480 <__libc_malloc@plt>
    4005c0: 48 89 45 f8             mov    %rax,-0x8(%rbp)
    4005c4: 48 8b 45 f8             mov    -0x8(%rbp),%rax
    4005c8: 48 89 c7                mov    %rax,%rdi
    4005cb: e8 80 fe ff ff          callq  400450 <free@plt>
    4005d0: b8 00 00 00 00          mov    $0x0,%eax
    4005d5: c9                      leaveq
    4005d6: c3                      retq
    4005d7: 66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
    4005de: 00 00

So to determine whether malloc or __libc_malloc is being used, you can check for calls to the routine's PLT entry.
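One way to automate that check is to scan the disassembly text for calls whose target is labelled name@plt. The sketch below assumes lines in the format shown in the dynamic listing (a call mnemonic followed by a target in angle brackets); plt_call_symbol is a hypothetical helper written for this illustration, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/* If `line` is a disassembly line containing a call to a PLT stub, e.g.
 * "4005a1: e8 ea fe ff ff callq 400490 <malloc@plt>", copy the symbol
 * name ("malloc") into `out` and return 1. Otherwise return 0. */
int plt_call_symbol(const char *line, char *out, size_t out_size)
{
    /* Matches both "call" and "callq" mnemonics. */
    const char *call = strstr(line, "call");
    if (call == NULL)
        return 0;

    const char *open = strchr(call, '<');
    if (open == NULL)
        return 0;

    const char *suffix = strstr(open, "@plt>");
    if (suffix == NULL)
        return 0;

    /* The name sits between '<' and "@plt>". */
    size_t len = (size_t)(suffix - (open + 1));
    if (len == 0 || len + 1 > out_size)
        return 0;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 1;
}
```

On the dynamic listing this yields malloc for the call at 4005a1 and __libc_malloc for the call at 4005bb, while the statically linked calls, which have no @plt suffix, are rejected.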

This of course all assumes that you're actually performing some type of static analysis of the binary. If you're doing this at run-time, the usual method is library interception using LD_PRELOAD, which is a whole different question.
