What is *(int*)&data[18] actually doing in this code?

Question

I came across this syntax for reading a BMP file in C++

#include <fstream>

int main()
{
    std::ifstream in("filename.bmp", std::ifstream::binary);
    in.seekg(0, in.end);
    std::streamsize size = in.tellg();
    in.seekg(0);
    unsigned char * data = new unsigned char[size];
    in.read((char *)data, size);
    int width = *(int*)&data[18];
    // omitted remainder for minimal example
}

and I don't understand what the line

int width = *(int*)&data[18];

is actually doing. Why doesn't a simple cast from unsigned char to int, such as int width = (int)data[18];, work?

It is taking the memory address of data[18], treating it as a pointer to an integer, then dereferencing it. Basically, treating it as a number. This seems like UB though, since data is only size 1
– ChrisMM, Commented Dec 5, 2019 at 0:00
"What is *(int*)&data[18] actually doing in this code?" Violating the Strict Aliasing Rule, so it could be doing absolutely anything.
– user4581301, Commented Dec 5, 2019 at 0:10
@WilliamMiller, unless I'm misreading, data is allocated as an array of 1 unsigned char. I think it should have been new unsigned char[size]
– ChrisMM, Commented Dec 5, 2019 at 0:11
Yes, but *(int*)&data[18] will also fail on CPUs that require a 32-bit number to be aligned to a 32-bit address (some CPUs allow misaligned access, but perform it much more slowly). Assuming that data itself is aligned to whatever size the CPU prefers (usually 32 or 64 bits), data[18] will not be, because 18 is not evenly divisible by 4 (32 bits in bytes). It will also give the wrong value if the CPU is big endian, because the byte order is then backwards.
– user4581301, Commented Dec 5, 2019 at 0:47

William Miller · Accepted Answer · 2019-12-11 22:59:55Z

Note

As @user4581301 indicated in the comments, this depends on the implementation and will fail in many instances. And as @NathanOliver-ReinstateMonica and @ChrisMM pointed out, this is undefined behavior and the result is not guaranteed.

According to the bitmap header format, the width of the bitmap in pixels is stored as a signed 32-bit integer beginning at byte offset 18. The syntax

int width = *(int*)&data[18];

reads the four bytes at offsets 18 through 21, inclusive (assuming a 32-bit int), and interprets them as an integer.

How?

&data[18] gets the address of the unsigned char at index 18
(int*) casts the address from unsigned char* to int*, so that a dereference reads sizeof(int) bytes instead of one
the leading * dereferences the resulting int* to get the int value stored at that address

So basically, it takes the address of data[18] and reads the bytes at that address as if they were an integer.
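Because the comments and the note flag the cast as undefined behavior, the same "read four bytes as an integer" step can be sketched with std::memcpy, which imposes no alignment or aliasing requirement. This is only a minimal sketch, not code from the question: read_i32_native is a hypothetical helper name, and it assumes the buffer has at least four readable bytes at the given offset. Like the cast, it interprets the bytes in the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the question): copy the four bytes that start
// at data[offset] into an int32_t. The result follows the host's byte order,
// just as the pointer cast would on a machine where the cast happens to work.
std::int32_t read_i32_native(const unsigned char* data, std::size_t offset) {
    std::int32_t value = 0;
    // memcpy copies raw bytes, so data + offset does not need to be aligned
    // and no int object needs to exist in the buffer.
    std::memcpy(&value, data + offset, sizeof value);
    return value;
}
```

With the question's buffer, the call would be int width = read_i32_native(data, 18);.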

Why doesn't a simple cast to `int` work?

sizeof(data[18]) is 1, because unsigned char is one byte (0-255). sizeof(&data[18]) is the size of a pointer, typically 4 on a 32-bit system and 8 on a 64-bit system (it can be smaller, for example on 16-bit systems), but the pointer's size does not determine how many bytes are read. The cast to (int*) and the subsequent dereference read sizeof(int) bytes, here 4, and specifically the 4 bytes at offsets 18 through 21, inclusive. A simple cast from unsigned char to int also yields a 4-byte int, but it carries only one byte of information from data, namely the value of data[18]. This is illustrated by the following example:

#include <bitset>
#include <iostream>
#include <string>

int main()
{
    // Stand-in for the file contents, large enough to cover offsets 18-21
    unsigned char data[32] = {};

    // Populate 18-21 with a recognizable pattern for demonstration
    std::bitset<8> _bits(std::string("10011010"));
    unsigned long bits = _bits.to_ulong();
    for (int ii = 18; ii < 22; ii++) {
        data[ii] = static_cast<unsigned char>(bits);
    }

    // Plain conversion: only the single byte at index 18 is used
    std::cout << "data[18] -> 1 byte " << std::bitset<32>(data[18]) << std::endl;
    // Pointer casts: 2 and 4 bytes are read starting at index 18 (UB, shown for illustration only)
    std::cout << "*(unsigned short*)&data[18] -> 2 bytes " << std::bitset<32>(*(unsigned short*)&data[18]) << std::endl;
    std::cout << "*(int*)&data[18] -> 4 bytes " << std::bitset<32>(*(int*)&data[18]) << std::endl;
}

data[18] -> 1 byte 00000000000000000000000010011010
*(unsigned short*)&data[18] -> 2 bytes 00000000000000001001101010011010
*(int*)&data[18] -> 4 bytes 10011010100110101001101010011010
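The output shows that each plain conversion of data[i] contributes exactly one byte, so a portable read can combine four such conversions explicitly. The sketch below does that, assuming the little-endian byte order that BMP header fields use; read_i32_le is a hypothetical helper name rather than anything from the question, and the caller is assumed to guarantee four readable bytes at the offset. Because it never reinterprets a pointer, it avoids the aliasing, alignment and host-endianness problems raised in the comments.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the question): build a 32-bit value from four
// consecutive bytes, treating data[offset] as the least significant byte.
std::int32_t read_i32_le(const unsigned char* data, std::size_t offset) {
    const std::uint32_t raw =
          static_cast<std::uint32_t>(data[offset])
        | static_cast<std::uint32_t>(data[offset + 1]) << 8
        | static_cast<std::uint32_t>(data[offset + 2]) << 16
        | static_cast<std::uint32_t>(data[offset + 3]) << 24;
    // For values above INT32_MAX this conversion is implementation-defined
    // before C++20 and wraps to the two's complement value from C++20 on.
    return static_cast<std::int32_t>(raw);
}
```

For example, a header whose bytes at offsets 18 to 21 are 0x80 0x02 0x00 0x00 yields a width of 640 on any host.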

You should note that this cast is UB. There is no integer there so synthesizing one is illegal.
@NathanOliver-ReinstateMonica Good point, I'm glad others noticed that
