26

I am trying to convert 4 bytes to an integer using C++.

This is my code:

int buffToInteger(char * buffer)
{
    int a = (int)(buffer[0] << 24 | buffer[1] << 16 | buffer[2] << 8 | buffer[3]);
    return a;
}

The code above works in almost all cases, for example: When my buffer is: "[\x00, \x00, \x40, \x00]" the code will return 16384 as expected.

But when the buffer is filled with: "[\x00, \x00, \x3e, \xe3]", the code won't work as expected and will return "ffffffe1".

Does anyone know why this happens?

4
  • 4
    ".. the code won't work as expected": Your code does work as expected (because I expect this behavior with signed byte values). Commented Jan 22, 2016 at 10:01
  • @Jongware Why does it return ffffffe1 and not 16102? Can you please explain? Commented Jan 22, 2016 at 10:03
  • @i486 the problem is not endianness. Commented Jan 22, 2016 at 10:09
  • @TonyD: that's not right. (But you're looking in the right neighborhood.) Commented Jan 22, 2016 at 10:12

6 Answers 6

36

Your buffer contains signed characters. So, actually, buffer[3] == -29, which upon conversion to int gets sign-extended to 0xffffffe3, and in turn (0x3e << 8) | 0xffffffe3 == 0xffffffe3.

You need to ensure your individual buffer bytes are interpreted as unsigned, either by declaring buffer as unsigned char *, or by explicitly casting:

int a = int((unsigned char)(buffer[0]) << 24 | (unsigned char)(buffer[1]) << 16 | (unsigned char)(buffer[2]) << 8 | (unsigned char)(buffer[3])); 
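A variation on the cast above, as a minimal sketch: widening each byte to std::uint32_t before shifting keeps every shift in unsigned arithmetic, so no bit is ever shifted into the sign bit of an int. The name readBigEndian32 is hypothetical and not part of the original answer; the sketch assumes 8-bit bytes and that the buffer holds the most significant byte first, as in the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original answer): decode 4 bytes,
// most significant byte first, without any signed intermediate value.
std::uint32_t readBigEndian32(const char* buffer)
{
    assert(buffer != nullptr);

    // Go through unsigned char first so a negative char cannot sign-extend,
    // then store in uint32_t so the shifts below operate on unsigned values.
    const std::uint32_t b0 = static_cast<unsigned char>(buffer[0]);
    const std::uint32_t b1 = static_cast<unsigned char>(buffer[1]);
    const std::uint32_t b2 = static_cast<unsigned char>(buffer[2]);
    const std::uint32_t b3 = static_cast<unsigned char>(buffer[3]);

    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
}
```

For the buffer from the question, [\x00, \x00, \x3e, \xe3], this yields 0x3ee3. If a signed int is needed, the result can be converted at the call site once the unsigned value has been assembled.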

4 Comments

Isn't this undefined behavior? As far as I understand, the result type of (unsigned char)(buffer[0]) << 24 is int, and with some values of buffer[0], the operation can overflow the range of 32-bit int.
@khuttun there is a special case just for that: if the type is signed and the left operand is positive, and the result can be represented by the unsigned version of the type, then the behavior is defined. The final result will be the conversion to a signed number of the result.
Thanks @spectras, you are correct: en.cppreference.com/w/cpp/language/…
This is a C++ question, but it doesn't look like that special case exists in C. From the latest C23 draft: "6.5.7 Bitwise shift operators [...] The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. [...] The result of E1 << E2 is [...]. If E1 has an unsigned type, [...]. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined." So ((uint8_t)0xFF) << 24 is UB in C assuming a 32-bit int.
5

In the expression buffer[0] << 24 the value 24 is an int, so buffer[0] will also be converted to an int before the shift is performed.

On your system a char is apparently signed, and will then be sign extended when converted to int.
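The effect of that promotion can be reproduced in isolation. The sketch below is only an illustration: combineSigned is a hypothetical name, signed char is spelled out so the result does not depend on whether plain char is signed, and a two's complement representation is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combine two bytes the way the question's code does,
// using signed char explicitly to reproduce the sign extension.
std::uint32_t combineSigned(signed char high, signed char low)
{
    // Left-shifting a negative value is not well defined before C++20,
    // so this sketch only accepts a nonnegative high byte.
    assert(high >= 0);

    // Both operands are promoted to int before << and | are applied.
    const int shiftedHigh = high << 8;
    // A negative low byte becomes an int with all upper bits set.
    const int promotedLow = low;

    // The set upper bits of promotedLow swallow the bits of shiftedHigh.
    return static_cast<std::uint32_t>(shiftedHigh | promotedLow);
}
```

With high = 0x3e and low = -29 (the byte 0xe3 read as signed), the function returns 0xffffffe3, because the high byte's contribution is lost in the sign-extended bits. With a low byte below 0x80, such as 0x63, it returns 0x3e63.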

Comments

5

I think this could be also done with use of memcpy:

#include <cstring>

int buffToInteger(char* buffer)
{
    int a;
    std::memcpy(&a, buffer, sizeof(int));
    return a;
}

This is much faster than the example mentioned in the original post, because it just treats all bytes "as is" and there is no need to do any operations such as bit shift etc. It also doesn't cause any signed-unsigned issues.

1 Comment

This could cause endianness issues, as the bytes could be in little-endian order and the integer in big endian order.
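To make the endianness concern concrete, here is a rough sketch that keeps the memcpy approach but corrects the byte order afterwards. All three function names are hypothetical, the buffer is assumed to hold a big-endian value as in the question, and only little-endian and big-endian hosts are considered (mixed-endian machines are ignored).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the 4 bytes as they are; the result depends on the host byte order.
std::uint32_t loadNative32(const char* buffer)
{
    assert(buffer != nullptr);
    std::uint32_t value;
    std::memcpy(&value, buffer, sizeof value);
    return value;
}

// True when the host stores the least significant byte first.
bool hostIsLittleEndian()
{
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Interprets the buffer as big-endian regardless of the host byte order.
std::uint32_t loadBigEndian32(const char* buffer)
{
    std::uint32_t v = loadNative32(buffer);
    if (hostIsLittleEndian()) {
        // Reverse the four bytes.
        v = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
            ((v << 8) & 0x00ff0000u) | (v << 24);
    }
    return v;
}
```

On a little-endian host, loadNative32 applied to [\x00, \x00, \x3e, \xe3] gives 0xe33e0000, while loadBigEndian32 gives 0x3ee3 on either kind of host.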
3

There's an implicit promotion to a signed int in your shifts. That's because char is (apparently) signed on your platform (the common case) and << promotes its operands to int implicitly. In fact none of this would work otherwise, because << 8 (and higher) would shift all your bits out of an 8-bit value!

If you're stuck with using a buffer of signed chars this will give you what you want:

#include <iostream>
#include <iomanip>

int buffToInteger(char * buffer)
{
    int a = static_cast<int>(static_cast<unsigned char>(buffer[0]) << 24 |
                             static_cast<unsigned char>(buffer[1]) << 16 |
                             static_cast<unsigned char>(buffer[2]) << 8 |
                             static_cast<unsigned char>(buffer[3]));
    return a;
}

int main(void)
{
    char buff[4] = {0x0, 0x0, 0x3e, static_cast<char>(0xe3)};
    int a = buffToInteger(buff);
    std::cout << std::hex << a << std::endl;
    return 0;
}

Be careful about bit shifting on signed values. Promotions don't just add bytes but may convert values.

For example a gotcha here is that you can't use static_cast<unsigned int>(buffer[1]) (etc.) directly because that converts the signed char value to a signed int and then reinterprets that value as an unsigned.
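The difference between the two casts can be checked with a small sketch. The helper names are hypothetical, std::uint32_t stands in for unsigned int so the expected values are exact, and 8-bit bytes are assumed.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Converts the (possibly negative) value directly to a 32-bit unsigned type.
// The conversion is modulo 2^32, so -29 becomes 0xffffffe3.
std::uint32_t widenDirectly(signed char byte)
{
    return static_cast<std::uint32_t>(byte);
}

// Converts to unsigned char first (modulo 256), then widens with zero bits.
// Here -29 becomes 0xe3, which is the raw byte value.
std::uint32_t widenViaUnsignedChar(signed char byte)
{
    const std::uint32_t result = static_cast<unsigned char>(byte);
    assert(result <= 0xffu);  // the raw byte always fits in 8 bits
    return result;
}
```

For nonnegative inputs both helpers agree; they differ only when the byte is negative, which is exactly the case that broke the original function.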

If anyone asks me, all implicit numeric conversions are bad. No program should have so many that writing them explicitly would become a chore. It's a softness in C++, inherited from C, that causes all sorts of problems that far exceed its value. It's even worse in C++ because they make the already confusing overloading rules even more confusing.

2 Comments

Does it also work on converting 4bytes to unsigned integer ? or it needs modification ?
@atari83 Apart from the obvious changes to the return value and inside the function it should be fine. It only really depends on the characters representing a big-endian array (most significant byte first) containing an unsigned integer.
2
char buffer[4]; int a; a = *(int*)&buffer; 

This takes the address of the buffer, casts it to a pointer to int, and then dereferences it.

3 Comments

Hi Mylotyrena, welcome to Stack Overflow! Please explain why your code solves the problem.
This is probably undefined behavior since the buffer may not be aligned to int's 4-byte alignment.
In principle, this will probably work, but it was a pretty poor way to write it in C++ even at the time this was answered. You really shouldn't still be using C-style casts for what you're trying to do here. If anything, the last line should be a = *reinterpret_cast<int *>(&buffer);. The intention is much clearer!
1
int buffToInteger(char * buffer)
{
    return *reinterpret_cast<int*>(buffer);
}

This conversion is simple and fast. We only tell the compiler to treat a byte array in memory as a single integer.

1 Comment

Please don't post only code as answer, but also provide an explanation what your code does and how it solves the problem of the question. Answers with an explanation are usually more helpful and of better quality, and are more likely to attract upvotes.
