
Could someone explain this code to me, please? I have received some byte code from an assembler and now I have to use it in my virtual machine. This function appears in the code base, but I don't know how it works or what it is used for.

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}
  • It copies 4 bytes into a uint32_t variable: bytes[0] into the most significant byte, bytes[1] into the next most significant, ... bytes[3] into the least significant. Maybe it's for converting a big-endian unsigned integer into the endianness of the local machine. Commented Feb 6, 2020 at 14:55
  • I wrote it, and as mentioned then, it swaps byte order from big endian to little endian. Now would be a good time to study endianness, bit shifts, bitwise OR... Commented Feb 6, 2020 at 14:56
  • It converts four bytes in big-endian order into a native 32-bit value. The bytes need not be properly aligned. The native value might be big-endian or little-endian. Commented Feb 6, 2020 at 14:56
  • And yeah, @JonathanLeffler is correct: strictly speaking it converts from big endian to the endianness of the CPU, whatever that happens to be, portably. This is why bit shifts are superior to any other version: in addition to being very fast, they give CPU-independent code. Commented Feb 6, 2020 at 14:59
  • In general, when you see this kind of code and don't understand it, you should at least try providing some sample input, examining the output, and maybe stepping through it in a debugger (a minimal harness is sketched right after these comments). You should be able to form some kind of intuition before you give up. In this case it may also help to know about endianness. Commented Feb 6, 2020 at 15:09
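Following that suggestion, here is a minimal test harness, a sketch that assumes nothing beyond the function from the question (the sample inputs and main are hypothetical) and simply feeds it a couple of byte sequences and prints the results:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}

int main(void)
{
    const uint8_t a[4] = { 0x00, 0x00, 0x00, 0x2A };  /* 42 stored big-endian */
    const uint8_t b[4] = { 0xFF, 0xFF, 0xFF, 0xFF };  /* all bits set */
    printf("%" PRId32 "\n", bytecode_to_int32(a));    /* prints 42 */
    printf("%" PRId32 "\n", bytecode_to_int32(b));    /* prints -1 on two's-complement targets */
    return 0;
}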

2 Answers


It builds up a 32-bit word from 4 bytes. For example, if the bytes are 1st: 0x12, 2nd: 0x34, 3rd: 0x56, 4th: 0x78, then:

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |  // -> 0x12000000
        (uint32_t)bytes[1] << 16 |  // -> 0x00340000
        (uint32_t)bytes[2] <<  8 |  // -> 0x00005600
        (uint32_t)bytes[3] <<  0;   // -> 0x00000078
    return (int32_t)result;         // ORing these together gives 0x12345678
}
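To see those intermediate values for yourself, here is a small standalone sketch (the variable names and main are hypothetical, not part of the answer) that computes each shifted term and prints their bitwise OR:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint8_t bytes[4] = { 0x12, 0x34, 0x56, 0x78 };
    uint32_t b0 = (uint32_t)bytes[0] << 24;  /* 0x12000000 */
    uint32_t b1 = (uint32_t)bytes[1] << 16;  /* 0x00340000 */
    uint32_t b2 = (uint32_t)bytes[2] <<  8;  /* 0x00005600 */
    uint32_t b3 = (uint32_t)bytes[3] <<  0;  /* 0x00000078 */
    printf("0x%08" PRIX32 "\n", b0 | b1 | b2 | b3);  /* prints 0x12345678 */
    return 0;
}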

1 Comment

Also notably, the casts to uint32_t are absolutely mandatory: without them each bytes[i] is implicitly promoted by the << operator to signed int, and the code could end up shifting a bit into the sign bit of that int.
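To illustrate why, here is a small hypothetical example (not from the answer); the uncast variant is only shown in a comment, because evaluating it would be undefined behavior where int is 32 bits wide:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t top = 0x80;                    /* high bit set */
    /* Without the cast, top is promoted to (signed) int, and
       top << 24 would shift a 1 into the sign bit of that int,
       which is undefined behavior when int is 32 bits wide. */
    uint32_t ok = (uint32_t)top << 24;     /* well-defined: 0x80000000 */
    printf("0x%08" PRIX32 "\n", ok);
    return 0;
}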

This function combines the four bytes in a uint8_t[4] into a single uint32_t in big-endian byte order, casts the result to a signed int32_t, and returns that.

So, if you pass a pointer to the array { 0xAA, 0xBB, 0xCC, 0xDD } to the function, it will combine them into a 32-bit integer with the most significant bytes of the integer coming from the lowest addresses in the array, giving you 0xAABBCCDD or -1430532899.

However, if the array pointed to by the argument bytes is not at least four bytes long, it has undefined behavior.
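A minimal check of the example above, as a sketch (the main function is hypothetical, and the printed values assume a two's-complement target, which every mainstream platform is):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static int32_t bytecode_to_int32(const uint8_t* bytes)
{
    uint32_t result =
        (uint32_t)bytes[0] << 24 |
        (uint32_t)bytes[1] << 16 |
        (uint32_t)bytes[2] <<  8 |
        (uint32_t)bytes[3] <<  0;
    return (int32_t)result;
}

int main(void)
{
    const uint8_t bytes[4] = { 0xAA, 0xBB, 0xCC, 0xDD };
    int32_t v = bytecode_to_int32(bytes);
    printf("%" PRId32 "\n", v);                /* prints -1430532899 */
    printf("0x%08" PRIX32 "\n", (uint32_t)v);  /* prints 0xAABBCCDD */
    return 0;
}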

