
Could someone explain this code to me, please? I have received some byte code from an assembler and now I have to use it in my virtual machine. This function appears in the code base, but I don't know how it works or what it is used for.

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}
  • It copies 4 bytes into a uint32_t variable: bytes[0] into the most significant byte, bytes[1] into the next most significant, ... bytes[3] into the least significant. Maybe it's for converting a big-endian unsigned integer into the endianness of the local machine. Commented Feb 6, 2020 at 14:55
  • I wrote it, and as mentioned then, it swaps byte order from big endian to little endian. Now would be a good time to study endianness, bit shifts, bitwise OR... Commented Feb 6, 2020 at 14:56
  • It converts four bytes in big-endian order into a native 32-bit value. The bytes need not be properly aligned. The native value might be big-endian or little-endian. Commented Feb 6, 2020 at 14:56
  • And yeah, @JonathanLeffler is correct: strictly speaking it converts from big endian to the endianness of the CPU, whatever that happens to be, portably. This is why bit shifts are superior to any other version: in addition to being very fast, they give CPU-independent code. Commented Feb 6, 2020 at 14:59
  • In general, when you see this kind of code and don't understand it, you should at least try providing some sample input, examining the output, and maybe stepping through it in a debugger (a minimal harness is sketched right after these comments). You should be able to form some kind of intuition before you give up. In this case it may also help to know about endianness. Commented Feb 6, 2020 at 15:09
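Following that suggestion, here is a minimal test harness, a sketch that assumes nothing beyond the function from the question (the sample inputs and main are hypothetical) and simply feeds it a couple of byte sequences and prints the results:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}

int main(void)
{
    const uint8_t a[4] = { 0x00, 0x00, 0x00, 0x2A };  /* 42 stored big-endian */
    const uint8_t b[4] = { 0xFF, 0xFF, 0xFF, 0xFF };  /* all bits set */
    printf("%" PRId32 "\n", bytecode_to_int32(a));    /* prints 42 */
    printf("%" PRId32 "\n", bytecode_to_int32(b));    /* prints -1 on two's-complement targets */
    return 0;
}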

2 Answers


It builds up a 32-bit word from 4 bytes. For example, if the bytes are 1st: 0x12, 2nd: 0x34, 3rd: 0x56, 4th: 0x78, then:

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |  // -> 0x12000000
        (uint32_t)bytes[1] << 16 |  // -> 0x00340000
        (uint32_t)bytes[2] <<  8 |  // -> 0x00005600
        (uint32_t)bytes[3] <<  0;   // -> 0x00000078
    return (int32_t)result;         // ORing these together gives 0x12345678
}
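To see those intermediate values for yourself, here is a small standalone sketch (the variable names and main are hypothetical, not part of the answer) that computes each shifted term and prints their bitwise OR:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint8_t bytes[4] = { 0x12, 0x34, 0x56, 0x78 };
    uint32_t b0 = (uint32_t)bytes[0] << 24;  /* 0x12000000 */
    uint32_t b1 = (uint32_t)bytes[1] << 16;  /* 0x00340000 */
    uint32_t b2 = (uint32_t)bytes[2] <<  8;  /* 0x00005600 */
    uint32_t b3 = (uint32_t)bytes[3] <<  0;  /* 0x00000078 */
    printf("0x%08" PRIX32 "\n", b0 | b1 | b2 | b3);  /* prints 0x12345678 */
    return 0;
}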

1 Comment

Also notably, the casts to uint32_t are absolutely mandatory: without them each bytes[i] is implicitly promoted by the << operator to signed int, and the code could end up shifting a bit into the sign bit of that int.
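To illustrate why, here is a small hypothetical example (not from the answer); the uncast variant is only shown in a comment, because evaluating it would be undefined behavior where int is 32 bits wide:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t top = 0x80;                    /* high bit set */
    /* Without the cast, top is promoted to (signed) int, and
       top << 24 would shift a 1 into the sign bit of that int,
       which is undefined behavior when int is 32 bits wide. */
    uint32_t ok = (uint32_t)top << 24;     /* well-defined: 0x80000000 */
    printf("0x%08" PRIX32 "\n", ok);
    return 0;
}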

This function combines the four bytes in a uint8_t[4] into a single uint32_t in big-endian byte order, casts the result to a signed int32_t, and returns that.

So, if you pass a pointer to the array { 0xAA, 0xBB, 0xCC, 0xDD } to the function, it will combine them into a 32-bit integer with the most significant bytes of the integer coming from the lowest addresses in the array, giving you 0xAABBCCDD or -1430532899.

However, if the array pointed to by the argument bytes is not at least four bytes long, it has undefined behavior.
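A minimal check of the example above, as a sketch (the main function is hypothetical, and the printed values assume a two's-complement target, which every mainstream platform is):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}

int main(void)
{
    const uint8_t bytes[4] = { 0xAA, 0xBB, 0xCC, 0xDD };
    int32_t v = bytecode_to_int32(bytes);
    printf("%" PRId32 "\n", v);                /* prints -1430532899 */
    printf("0x%08" PRIX32 "\n", (uint32_t)v);  /* prints 0xAABBCCDD */
    return 0;
}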

