2

I usually write my code on Windows, and there are two different types of development environments, each providing their own tools to view the assembly code of an object file(*.obj) or executable (*.exe).

If I am working with Visual Studio build system from command line, the dumpbin /disasm file.obj command can generate disassemble a binary file. A snippet of a disassembly from an executable, produced by dumpbin :

 000000014000E712: 41 81 F0 6E 74 65 xor r8d,6C65746Eh 6C 000000014000E719: 41 81 F1 47 65 6E xor r9d,756E6547h 75 000000014000E720: 44 8B D2 mov r10d,edx 000000014000E723: 8B F0 mov esi,eax 000000014000E725: 33 C9 xor ecx,ecx 000000014000E727: 41 8D 43 01 lea eax,[r11+1] 000000014000E72B: 45 0B C8 or r9d,r8d 000000014000E72E: 0F A2 cpuid 000000014000E730: 41 81 F2 69 6E 65 xor r10d,49656E69h 49 000000014000E737: 89 04 24 mov dword ptr [rsp],eax 

However, if I am working with the GNU toolkit (I mean mingw64, which works with native windows binaries), then running objdump -D file.obj gives a disassembly like this:

 14000e712: 41 81 f0 6e 74 65 6c xor $0x6c65746e,%r8d 14000e719: 41 81 f1 47 65 6e 75 xor $0x756e6547,%r9d 14000e720: 44 8b d2 mov %edx,%r10d 14000e723: 8b f0 mov %eax,%esi 14000e725: 33 c9 xor %ecx,%ecx 14000e727: 41 8d 43 01 lea 0x1(%r11),%eax 14000e72b: 45 0b c8 or %r8d,%r9d 14000e72e: 0f a2 cpuid 14000e730: 41 81 f2 69 6e 65 49 xor $0x49656e69,%r10d 14000e737: 89 04 24 mov %eax,(%rsp) 

Now, it is immediately clear that both are providing the same information. However, I want to know what the numbers on the left column mean (e.g. 14000e712)? Also why is the instruction written differently (e.g. on the first line, dumpbin writes r8d,6C65746Eh, while objdump writes $0x6c65746e,%r8d). Why is this, and what do the different representations mean? Additionally dumpbin seems to write extra information such as dword ptr that objdump doesn't write.

1
  • dumpbin is using what is known as Intel (dis)assembly syntax. By default, objdump, being a GNU utility is using what is known as AT&T (dis)assembly syntax. If you want objdump to display output in Intel syntax, add -Mintel to your objdump command line. Commented May 20, 2022 at 9:57

1 Answer 1

1

Let's break it down. The first and most obvious difference is Intel syntax (dumpbin) vs. AT&T syntax (objdump) for the output you give. That's be the part of your question:

Also why is the instruction written differently (e.g. on the first line, dumpbin writes r8d,6C65746Eh, while objdump writes $0x6c65746e,%r8d). Why is this, and what do the different representations mean?

However, objdump lets you choose between the two and just defaults to AT&T (aka att). Excerpt from the man page:

"intel" "att" Select between intel syntax mode and AT&T syntax mode. 

So you could simply use: objdump -D -M intel ... (also -Mintel) to get way closer to the output from dumpbin.

However, a comparison of the syntax variants can be found on Wikipedia. This dated overview may also help. The most important difference is that Intel syntax places the target first and the source last, whereas with AT&T it's the opposite.

Let's take the instruction you gave:

  • Intel: xor r8d,6C65746Eh
    • xor instruction
    • (first) target operand is r8d (lower 32-bit of the r8 register)
    • (second) source operand is a literal 6C65746Eh (the hexadecimal is denoted via the trailing h here)
  • AT&T: xor $0x6c65746e,%r8d
    • xor instruction
    • (first) source operand is a literal $0x6c65746e (the hexadecimal is denoted via the leading 0x here, IIRC $ is for literals/addresses)
    • (second) target operand is %r8d (lower 32-bit of the r8 register)

NB: This is largely a matter of taste. Binutils (the set of tools around objdump) and others like GDB default to AT&T syntax, but you can tell them to use the Intel syntax. Most of the disassembly I work with is Intel syntax, but it's good to be aware of the two syntax variants and know how they compare.

However, I want to know what the numbers on the left column mean (e.g. 14000e712)?

Those are the addresses. You probably know that executables typically take a different form when mapped into memory than on disk and that address implies two things:

  1. it pretends that the image is mapped at base address 0x140000000
  2. 0x14000e712 is simply an address with offset 0xe712 into the mapped image

Edit 1: Oh and perhaps one word about this mov dword ptr [rsp],eax versus mov %eax,(%rsp) business. I find the Intel syntax more readable, since it doesn't make be think where the syntax can give the clue. "Write DWORD to address pointed to by rsp, fair enough". However, I suppose the reasoning behind the more concise AT&T syntax is that the knowledge about the operation's size (DWORD) can be deduced from the operand (eax) and so it simply leaves out the more or less cosmetic hint of dword ptr.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.