11

How do you explain that line 7 gets a warning, but not line 5 or line 6?

    int main()
    {
        unsigned char a = 0xFF;
        unsigned char b = 0xFF;
        a = a | b;                         // 5: (no warning)
        a = (unsigned char)(b & 0xF);      // 6: (no warning)
        a = a | (unsigned char)(b & 0xF);  // 7: (warning)
        return 0;
    }

GCC 4.6.2 output when compiled on 32-bit architecture (Windows PC):

    gcc -c main.c --std=c89 -Wall -Wextra -Wconversion -pedantic
    main.c: In function 'main':
    main.c:7:11: warning: conversion to 'unsigned char' from 'int' may alter its value [-Wconversion]

If this helps you understand my question, here is how I see this (probably incorrect!):

I suppose that on a 32-bit machine operations are done on 32-bit numbers. Since unsigned char fits into 32-bit int, the operation result is 32-bit int. But since GCC doesn't give warnings on lines 5 and 6, I guess there is something else going on:

line 5: GCC figures that (uchar) OR (uchar) is never bigger than MAX(uchar), so no warning.

line 6: GCC figures that (uchar) AND 0xF is never bigger than MAX(uchar), so no warning. Explicit cast is not even necessary.

line 7: Based on assumptions above: AND should not give warning (since line 6), OR should not give warning either (since line 5).

I guess my logic is faulty somewhere there. Help me understand the logic of the compiler.

  • This looks like a bug in the compiler: a clang-based compiler on Mac produces a warning-free compile with the settings that you specified. Commented Jun 21, 2012 at 9:19
  • No warning from GCC 4.4.5 on Linux/x86-64. Commented Jun 21, 2012 at 9:37
  • Can anyone confirm that they are getting the same warning as I do? Commented Jun 21, 2012 at 9:41
  • Yes, I get the same warning with GCC 4.4.3 on Linux. Commented Jun 21, 2012 at 9:42
  • I think the answer may be somewhere here: "Why does Wconversion emit a warning in an implicit conversion between variables even when it is known at compile time that the value does not change?": "Warning because there is no flow control in front-ends (so we don't know the value of d)." Commented Jun 21, 2012 at 11:20

4 Answers

2

Compilers are built by people, and they don't have infinite time to work through every arithmetic possibility and decide which cases are worth a warning.

So I believe (attention: opinion) that compiler engineers would proceed as follows:

  • generally issue a warning if the code looks as if it could be wrong.
  • find the obvious cases where the compiler can easily be taught that the code is fine.
  • leave the rest of the warnings as false positives, because the person either knows what they're doing or will be relieved that the compiler is warning.

I would expect people to write code where either the result is cast to (unsigned char) or the outermost operator masks all higher bytes with a constant.

  • a = (unsigned char) ( /* some obscure bit-wise expression */ ); would be OK then
  • a = 0xff & ( /* some obscure bit-wise expression */ ); also OK

If you know that your compiler handles those two patterns correctly, the other cases shouldn't bother you too much.

I've seen compilers that would issue a warning for a = a | b; so GCC not giving a warning is a free bonus. It might be that GCC just infers the constant assignments in a | b and therefore replaces it with 0xff | 0xff, which is known to work without problems. If that is what happens, though, I don't know why it cannot derive the constant value of a in the other statements.


2 Comments

+1 for a good answer, thanks! Best so far, and if I don't get anything better in a while, I will mark it as the Answer.
I think Alex's last paragraph contains the important piece of information. gcc doesn't emit a warning for line 5 because it performs the 'OR' operation at compile time and optimizes away that line of code. I suspect that if you swap lines 7 and 5, you'll see the compiler emit warnings for different pieces of code.
0

I use Linux x86_64 with GCC 4.7.0 and get the same warning. I compiled the code and used gdb to disassemble the executable. Here is what I got.

    (gdb) l
    1 int main(){
    2 unsigned char a = 0xff;
    3 unsigned char b = 0xff;
    4 a = a | b;
    5 a = (unsigned char)(b & 0xf);
    6 a |= (unsigned char)(b & 0xf);
    7 return 0;
    8 }
    (gdb) b 4
    Breakpoint 1 at 0x4004a8: file test.c, line 4.
    (gdb) b 5
    Breakpoint 2 at 0x4004af: file test.c, line 5.
    (gdb) b 6
    Breakpoint 3 at 0x4004b9: file test.c, line 6.
    (gdb) r
    Starting program: /home/spyder/stackoverflow/a.out

    Breakpoint 1, main () at test.c:4
    4 a = a | b;
    (gdb) disassemble
    Dump of assembler code for function main:
       0x000000000040049c <+0>: push %rbp
       0x000000000040049d <+1>: mov %rsp,%rbp
       0x00000000004004a0 <+4>: movb $0xff,-0x1(%rbp)
       0x00000000004004a4 <+8>: movb $0xff,-0x2(%rbp)
    => 0x00000000004004a8 <+12>: movzbl -0x2(%rbp),%eax
       0x00000000004004ac <+16>: or %al,-0x1(%rbp)
       0x00000000004004af <+19>: movzbl -0x2(%rbp),%eax
       0x00000000004004b3 <+23>: and $0xf,%eax
       0x00000000004004b6 <+26>: mov %al,-0x1(%rbp)
       0x00000000004004b9 <+29>: movzbl -0x2(%rbp),%eax
       0x00000000004004bd <+33>: mov %eax,%edx
       0x00000000004004bf <+35>: and $0xf,%edx
       0x00000000004004c2 <+38>: movzbl -0x1(%rbp),%eax
       0x00000000004004c6 <+42>: or %edx,%eax
       0x00000000004004c8 <+44>: mov %al,-0x1(%rbp)
       0x00000000004004cb <+47>: mov $0x0,%eax
       0x00000000004004d0 <+52>: pop %rbp
       0x00000000004004d1 <+53>: retq
    End of assembler dump.

The statement a = a | b is compiled to

    movzbl -0x2(%rbp),%eax
    or %al,-0x1(%rbp)

The statement a = (unsigned char)(b & 0xf) is compiled to

    movzbl -0x2(%rbp),%eax
    and $0xf,%eax
    mov %al,-0x1(%rbp)

The statement a |= (unsigned char)(b & 0xf); is compiled to

    movzbl -0x2(%rbp),%eax
    mov %eax,%edx
    and $0xf,%edx
    movzbl -0x1(%rbp),%eax
    or %edx,%eax
    mov %al,-0x1(%rbp)

The explicit cast doesn't appear in the asm code. The problem arises when the (b & 0xf) operation is done: its result has type int, so it is sizeof(int) bytes wide. So you should use this instead:

a = (unsigned char)(a | (b & 0xF)); 

PS: An explicit cast doesn't generate any warning, even if you lose information by it.

6 Comments

movzbl -offset(%rbp),%eax and mov %al,-0x1(%rbp) do the casts effectively. And I don't think the OP should write their code any differently. It's valid and reasonable C code.
@Alex look at or %al,-0x1(%rbp) and and $0xf,%eax: the difference means the second needs an explicit cast. And the third explicit cast doesn't do anything, because after the and $0xf,%edx instruction there is no memory operation on the result.
At any rate, I don't think this disassembly proves anything. The generated code is correct.
a = a | b is equivalent to a |= b. It has just two operands, and both are 8 bits.
a and b are promoted to int before | (and |=) and the value of a|b is of type int. That's what it should be according to the C standard. The compiler has plenty of freedom in deciding of how to generate the code that does what the standard prescribes. It may be clever enough to compile this | into instructions with 8-bit operands only. It may be clever enough to compile it into faster instructions with 32-bit operands or a mix of 32-bit and 8-bit operands. As long as the program's behavior remains compliant to the standard, the compiler can do anything. The disasm proves nada.
0

The result type of the bitwise operator & is int. Whenever you implicitly convert an int (4 bytes) to char or unsigned char (1 byte), it gives you a warning.

So this is not related to the bitwise operator; it is related to converting a 4-byte value to a 1-byte variable.

6 Comments

I agree with Omer; I had the same problem as you not too long ago. The result of any bitwise operation is elevated to the size of a register in C. If you are on a 32-bit machine, it's 4 bytes.
@Alex Please re-read my question. I am not asking why compiler gives the warning. I am asking why it treats lines 5/6/7 differently. Your post is just stating the obvious and is not answering my question at all.
@Alex I edited Omer's post (he hasn't approved it so far): if you do bitwise arithmetic with constants, the compiler is able to derive the result size directly and can check whether the result would overflow. Thus line 5 doesn't get a warning. However, many compilers aren't good at arithmetic optimization, and they give a warning whenever the optimizer doesn't produce a clear result.
@Alex I'm not sure what you mean. You are saying that if there is a constant, then compiler will be able to derive result size and do the check, therefore line 5 gets no warning. First of all, line 5 doesn't have any constants in it. Furthermore, line 7 actually uses a constant and it gives a warning, but if you change the constant to a variable (i.e. a = a | (unsigned char)(b & a);), then the warning will disappear. I guess I should wait for your edit to appear.
Alex talking to Alex? Good grief this is confusing!
-1

I think the problem is that you convert int to unsigned char, AND back to int.

Line 6 converts int to unsigned char, but just stores it into unsigned char.
Line 7 converts int to unsigned char, and then, in order to do arithmetic, converts it back to int. The new integer may be different from the original, so you get a warning.

7 Comments

Did you consider line 5? Using your logic, shouldn't line 5 get a warning? "Line 5 takes unsigned char and then, in order to do arithmetic, converts it to int [...]"
Per the rules of C all 3 expressions to the right of the assignment operator are of type int. They aren't different in that respect.
@Alex, only line 7 converts from int to unsigned char AND back to int.
OK, I'm wrong about 2nd case, but sizeof(a | b) == sizeof(a | (unsigned char)(b & 0xF)) == sizeof(int).
In a|b, unsigned char is converted to int, that's all. In a|(unsigned char)(b&0xf), b&0x0f, which is int, is converted to unsigned char, then back to int.
