x64 instruction encoding and the ModRM byte

Question

The encoding of

call qword ptr [rax]
call qword ptr [rcx]

is

FF 10
FF 11

I can see where the last digit (0/1) comes from (the register number), but I'm trying to figure out where the second last digit (1) comes from. According to AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions page 56,

"/digit - Indicates that the ModRM byte specifies only one register or memory (r/m) operand. The digit is specified by the ModRM reg field and is used as an instruction-opcode extension. Valid digit values range from 0 to 7."

The equivalent Intel document says something similar, and call via a register is specified to be encoded as

FF /2

and... I have no idea what that means, or how the 2 in the specification connects to the high 1 digit in the end result. Is there a differently worded explanation available anywhere?

possible duplicate: How to read the Intel Opcode notation

– Peter Cordes, Commented May 11, 2019 at 4:32
another duplicate: What does the /4 mean in FF /4?

– Peter Cordes, Commented Aug 10, 2020 at 17:21

Alexey Frunze · Accepted Answer · 2013-03-19 22:28:49Z

The ModR/M byte has 3 fields:

bit 7 & bit 6 = mod
bit 5 through bit 3 = reg = /digit
bit 2 through bit 0 = r/m

This is depicted in Figure 2-1, "Intel 64 and IA-32 Architectures Instruction Format", of Vol. 2A of the Intel® 64 and IA-32 Architectures Software Developer’s Manual.

So, there:

0x10 = 00.010.000 (mod=0, reg/digit=2, r/m=0)

and

0x11 = 00.010.001 (mod=0, reg/digit=2, r/m=1).
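To make the bit split concrete, here is a minimal Python sketch that pulls the three fields out of a ModR/M byte with shifts and masks. The function name split_modrm is just an illustrative name, not something from the manuals.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bitfields."""
    mod = (byte >> 6) & 0b11    # bits 7..6
    reg = (byte >> 3) & 0b111   # bits 5..3, the /digit for opcodes like FF /2
    rm = byte & 0b111           # bits 2..0
    return mod, reg, rm


for b in (0x10, 0x11):
    mod, reg, rm = split_modrm(b)
    print(f"0x{b:02X} = {mod:02b}.{reg:03b}.{rm:03b} "
          f"(mod={mod}, reg/digit={reg}, r/m={rm})")
```

Running it on 0x10 and 0x11 should reproduce the two breakdowns above.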

nrz · Accepted Answer · 2013-03-21 19:34:20Z

I think you want to check table 2-2 in Intel® 64 and IA-32 Architectures Developer's Manual: Combined Volumes, Volume 2: Instruction Reference Set, Chapter 2: Instruction Format, 2.1.5 Addressing-Mode Encoding of ModR/M and SIB Bytes:

Table 2-2. 32-Bit Addressing Forms with the ModR/M Byte

r8(/r)                         AL   CL   DL   BL   AH   CH   DH   BH
r16(/r)                        AX   CX   DX   BX   SP   BP   SI   DI
r32(/r)                        EAX  ECX  EDX  EBX  ESP  EBP  ESI  EDI
mm(/r)                         MM0  MM1  MM2  MM3  MM4  MM5  MM6  MM7
xmm(/r)                        XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7
(In decimal) /digit (Opcode)   0    1    2    3    4    5    6    7
(In binary) REG =              000  001  010  011  100  101  110  111

Effective Address    Mod  R/M  Value of ModR/M Byte (in Hexadecimal)
[EAX]                00   000  00   08   10   18   20   28   30   38
[ECX]                     001  01   09   11   19   21   29   31   39
[EDX]                     010  02   0A   12   1A   22   2A   32   3A
[EBX]                     011  03   0B   13   1B   23   2B   33   3B
[--][--] *1               100  04   0C   14   1C   24   2C   34   3C
disp32 *2                 101  05   0D   15   1D   25   2D   35   3D
[ESI]                     110  06   0E   16   1E   26   2E   36   3E
[EDI]                     111  07   0F   17   1F   27   2F   37   3F
[EAX]+disp8 *3       01   000  40   48   50   58   60   68   70   78
[ECX]+disp8               001  41   49   51   59   61   69   71   79
[EDX]+disp8               010  42   4A   52   5A   62   6A   72   7A
[EBX]+disp8               011  43   4B   53   5B   63   6B   73   7B
[--][--]+disp8            100  44   4C   54   5C   64   6C   74   7C
[EBP]+disp8               101  45   4D   55   5D   65   6D   75   7D
[ESI]+disp8               110  46   4E   56   5E   66   6E   76   7E
[EDI]+disp8               111  47   4F   57   5F   67   6F   77   7F
[EAX]+disp32         10   000  80   88   90   98   A0   A8   B0   B8
[ECX]+disp32              001  81   89   91   99   A1   A9   B1   B9
[EDX]+disp32              010  82   8A   92   9A   A2   AA   B2   BA
[EBX]+disp32              011  83   8B   93   9B   A3   AB   B3   BB
[--][--]+disp32           100  84   8C   94   9C   A4   AC   B4   BC
[EBP]+disp32              101  85   8D   95   9D   A5   AD   B5   BD
[ESI]+disp32              110  86   8E   96   9E   A6   AE   B6   BE
[EDI]+disp32              111  87   8F   97   9F   A7   AF   B7   BF
EAX/AX/AL/MM0/XMM0   11   000  C0   C8   D0   D8   E0   E8   F0   F8
ECX/CX/CL/MM/XMM1         001  C1   C9   D1   D9   E1   E9   F1   F9
EDX/DX/DL/MM2/XMM2        010  C2   CA   D2   DA   E2   EA   F2   FA
EBX/BX/BL/MM3/XMM3        011  C3   CB   D3   DB   E3   EB   F3   FB
ESP/SP/AH/MM4/XMM4        100  C4   CC   D4   DC   E4   EC   F4   FC
EBP/BP/CH/MM5/XMM5        101  C5   CD   D5   DD   E5   ED   F5   FD
ESI/SI/DH/MM6/XMM6        110  C6   CE   D6   DE   E6   EE   F6   FE
EDI/DI/BH/MM7/XMM7        111  C7   CF   D7   DF   E7   EF   F7   FF

NOTES:
1. The [--][--] nomenclature means a SIB follows the ModR/M byte.
2. The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is added to the index.
3. The disp8 nomenclature denotes an 8-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is sign-extended and added to the index.
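Every hex value in that table is just the three fields packed together, so a row can be regenerated with a little arithmetic. The following is a rough Python sketch; modrm_row is a made-up helper name for illustration.

```python
def modrm_row(mod, rm):
    """Return the eight ModR/M byte values of one table row,
    one per /digit (reg field) value 0..7."""
    return [(mod << 6) | (digit << 3) | rm for digit in range(8)]


# The [EAX] row (mod=00, r/m=000) and the [ECX]+disp8 row (mod=01, r/m=001).
print(" ".join(f"{v:02X}" for v in modrm_row(0b00, 0b000)))
print(" ".join(f"{v:02X}" for v in modrm_row(0b01, 0b001)))
```

The third entry of the first row (the /2 column) is the 10 from the question.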

Lance Pollard · Accepted Answer · 2021-01-29 07:27:17Z

The /2 means that the reg field of the ModRM byte holds the value 2, used as an opcode extension. You can look it up in Table 2-2 in Volume 2A of the Intel docs (the 2's in the table and volume numbers have no relation to the /2, though). In the top-left of that table there is a /digit row. Follow that row to the right and find the 2 column. We'll come back to that.

Now, if you look at the call instruction definition, you'll see the Op/En, the "operand encoding".

Op/En  Operand 1      Operand 2  Operand 3  Operand 4
D      Offset         NA         NA         NA
M      ModRM:r/m (r)  NA         NA         NA

Also notice the call forms listed in the first table, for example this one, which is the 64-bit form corresponding to your use of rax:

Opcode  Instruction  Op/En
FF /2   CALL r/m64   M

That M tells us to look up the M in the "operand encoding" (Op/En) table below, which is:

Op/En  Operand 1      Operand 2  Operand 3  Operand 4
M      ModRM:r/m (r)  NA         NA         NA

So operand 1 is ModRM:r/m (r). The (r) means that the operand is read (not written to). The ModRM:r/m says the operand is encoded in the r/m field of the ModRM byte. The r in r/m means "register", and the m means "memory": the field can name either a register or a memory operand.

So going back to the /2 column in table 2-2, we have 010, right on the line that says REG. This is referring to the ModRM middle "reg" segment.

According to this, we have:

mod  description (relevant to us)
00   register indirect addressing mode
01   one-byte signed displacement follows addressing mode byte(s)
10   four-byte signed displacement follows addressing mode byte(s)
11   register addressing mode

Since we are using [rax], that is register indirect addressing mode, so 00.
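As a rough illustration of what the mod field implies for the bytes that follow, here is a small Python sketch. It ignores the SIB byte entirely (an assumption to keep it short), and displacement_bytes is a hypothetical helper name. The one special case it does handle comes straight from Table 2-2: mod 00 with r/m 101 means a disp32 follows.

```python
def displacement_bytes(mod, rm):
    """Number of displacement bytes implied by mod and r/m.
    Simplification: SIB-byte cases are not considered here."""
    if mod == 0b01:
        return 1                     # disp8
    if mod == 0b10:
        return 4                     # disp32
    if mod == 0b00 and rm == 0b101:
        return 4                     # the "disp32" row of Table 2-2
    return 0                         # plain [reg], or a register operand (mod 11)


print(displacement_bytes(0b00, 0b000))  # mod/rm for [rax]
```

For [rax] no displacement follows, which is why the whole instruction is only two bytes.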

So we have the mod, and the reg, now we need the r/m, to complete the ModRM byte.

From elsewhere on the web: the r/m field encodes which register is used. If we go back to Table 2-2 and to the /2 column, and match it with the Mod 00 box toward the left, and we use the EAX row (which is the same as the RAX used in your call [rax]), we end up at 10. Likewise, if we follow the ECX row (same as RCX in your call [rcx]), we get 11. That gives us:

FF 10    call [rax]
FF 11    call [rcx]

Notice the table shows the r/m value too: 000 for rax and 001 for rcx. That gives us the final ModRM byte.

ModRM       for  hex
00.010.000  rax  10
00.010.001  rcx  11
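Putting the pieces together, a minimal Python sketch of the encoding could look like this. The names REG_NUM and encode_call_indirect are invented for illustration. It deliberately leaves out rsp and rbp, because in Table 2-2 r/m 100 with mod 00 means a SIB byte follows and r/m 101 means a disp32, so those two need more than the plain two-byte form. It also leaves out r8 through r15.

```python
# Register numbers for the r/m field (only the simple mod=00 cases).
REG_NUM = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7}


def encode_call_indirect(base):
    """Encode `call qword ptr [base]` as opcode FF plus a ModR/M byte."""
    mod = 0b00             # register indirect, no displacement
    digit = 2              # the /2 from "FF /2"
    rm = REG_NUM[base]     # which register holds the address
    modrm = (mod << 6) | (digit << 3) | rm
    return bytes([0xFF, modrm])


for reg in ("rax", "rcx"):
    print(encode_call_indirect(reg).hex(" ").upper(), f"  call [{reg}]")
```

This should print the same two encodings as in the question.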

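Notice too that if you do call [eax] in 64-bit mode, the encoding is prefixed with 67 in hex. The ModRM byte stays the same, since eax has the same register number as rax:

67 FF 10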

That corresponds to the "address-size override prefix".

It's weird to pick CALL r/m16 as your example entry when the rest of your answer is using call r/m64. But yes, this looks right. Of course you don't actually need a table, just the right register numbers to plug in to the right fields in ModRM (whose layout is documented), as Alexey's answer describes. 32 and 64-bit mode use the same numbers in the r/m (and SIB index) fields as for /r.
Ok made the change. Ah that's interesting, you don't need the table! Mind blown, I thought it was needed. This answer though I think might help beginners start finding their way around.
Given the indirect usage of [rax], we get the mod 00. Given the register, we get 000 for r/m number (that comes from a simple table). Given the /2 we get the 010 because 010 is binary for 2. Combine those to get 00010000 to get the ModRM, which is 10 in hex. Neat, no complicated table necessary.
Yeah, that's why we linked you Q&As like this one a while ago. >.< x86 instruction encoding is complicated (especially with VEX and EVEX prefixes), but ModRM is not bad, just 3 bitfields.
