29

I know C compilers aren't required to use all zeros for the bit representation of NULL, but they *are* required by the standard to make NULL evaluate to false in boolean contexts/comparisons. Hence the 2nd printf in the program below will always output false.

But what I want to know is: on systems where NULL is *not* all zeros, will a pointer value that *is* all zeros also evaluate to false in boolean contexts/comparisons? In other words, will the 1st printf in the program below ever output true?

Or asked in a slightly different way: can I rely on calloc to produce a pointer value that will always evaluate to false in boolean contexts/comparisons? The 1st answer to this question uses memset to clear the bits of a long* named y, then goes on to say that y==0 is UB because y may be a "trap representation" (whatever that is). calloc is also just clearing bits, so maybe o->p in the 1st printf is also UB?


    #include <stdio.h>
    #include <stdlib.h>
    #include <assert.h>

    typedef struct { void * p; } obj;

    int main() {
        obj * o = calloc(sizeof(obj), 1);
        assert(o); // assume successful allocation
        printf("%s\n", o->p ? "true" : "false"); // 1st: could print "true"? Is o->p UB?
        o->p = NULL;
        printf("%s\n", o->p ? "true" : "false"); // 2nd: always prints "false"
        return 0;
    }
13
  • 2
    Are you aware of an architecture where a null pointer isn't all zero bits? I don't think I've ever seen one, and that includes weirdos like segmented 16-bit x86. Commented Aug 27, 2020 at 18:27
  • 1
    Note that a pointer in C isn't necessarily really a number, it just behaves in some cases as one (you can do arithmetics, etc.). That's why C also doesn't define %p for printf, as there doesn't need to be a consistent representation of "pointer". FWIW, a pointer could be literally an arrow pointing to some "object", C is very abstract in that regard. Thus, asking about a "bit representation" usually isn't really meaningful for a strict reading of the standard. Commented Aug 27, 2020 at 21:03
  • 1
    @textral How do you conclude the intent of the programmer from here? I can only tell that the programmer's intent was to write an all-bits-zero value there. Whether the intent was to have a null pointer or not, I cannot know. When in doubt, I have to assume the programmer knows his target platform and knows what an all-zero pattern is and what it is not. Commented Aug 28, 2020 at 10:25
  • 2
    @MarkRansom I worked on a C compiler for the CDC Cyber 180 line. These machines were intended to run a Multics-like operating system and so their 48-bit pointers included a 4-bit ring number. Only code running in ring 0 could create pointers that had the ring number set to 0. Hence we had null pointers which were not all 0's. If a pointer was used in a branch, we would move it to an integer register and mask out the ring number before testing it for all zeros; the second printf statement would print "false". The first printf accesses an uninitialized location and so is undefined behaviour. Commented Aug 28, 2020 at 15:35
  • 1
    @TheodoreNorvell thanks for the specific example. I learned assembler on a CDC Cyber 6400, and of course it worked completely differently - the address registers were only 18 bit, and there was no ring concept. Each process had its own address space, and supervisory functions were handled by a separate set of peripheral processors. Commented Aug 28, 2020 at 16:02

5 Answers

13
    typedef struct { void * p; } obj;

    obj * o = calloc(sizeof(obj), 1);
    assert(o); // Let us set aside the case of a failed allocation
    printf("%s\n", o->p ? "true" : "false"); // 1st: could print "true" ?

can I rely on calloc to produce a pointer value that will always evaluate to false in boolean contexts/comparisons?

No, the output could be "true" (*1).

The bit pattern of all zeros, as a pointer, may not be a null pointer.

7.22.3.2 The calloc function
2 The calloc function allocates space for an array of nmemb objects, each of whose size is size. The space is initialized to all bits zero.301)
Footnote 301) Note that this need not be the same as the representation of floating-point zero or a null pointer constant.


Example: An implementation may have only a single null pointer encoding, with a bit pattern of all ones. (void *)0 then converts the all-zeros bit pattern of int 0 to an all-ones void *. if (null_pointer) is always false, regardless of the bit pattern of the null pointer.


*1 In practice, yes: the output is always "false". Implementations that do not use the all-zero bit pattern as a null pointer are uncommon these days, but highly portable code should not assume this. An old or novel system may use the zero bit pattern as a non-null pointer and, sadly, break many a code base that assumes an all-zero bit pattern is a null pointer.

15 Comments

Thanks @chux, that justifies my uneasiness in using this code (though as you say, there are practically no implementations out there that behave in such an unintuitive way). Is the other answer I linked to in my question right: is it considered UB to even reference such a value?
And also, is the community answer that @Howlium linked to wrong? Or am I just misreading it? (I need to read it more closely... maybe they never refer to initialization at the bit level.)
In the second code fragment, the 6.3.2.3 3 isn't relevant. The expression o->i isn't an integer constant expression.
@chux-ReinstateMonica: An implementation may specify the behavior of dereferencing a null pointer--sometimes usefully. On typical compilers for certain Motorola/Freescale microcontrollers where the control register for port A is located at address zero, attempting to store 0x42 to *(unsigned char *)0 would set the PORT A control register to 0x42, and that would be the normal way of setting that register's value (perhaps *((unsigned short*)0xFFFF) = 0x42; would work if hardware ignored writes to 0xFFFF, but that seems jankier than writing to a literal address zero).
Sorry for the long pause @chux. Going back to our conversation (in the comments): so if the all-zeros bit pattern written by memset happened to be a null pointer value according to the C implementation, then referencing y and comparing it to 0 is well defined. But the code may find its way onto a system where the C impl only uses non-zero bit null pointers, in which case y==0 is comparing a non-null ptr to zero, so 0 is not converted to a pointer type, and hence the comparison is UB. Do I understand correctly?
6

Background information

Consider the following places where the logical value of an expression is used, all taken from C18, my emphasis in bold italic:

  • 6.3.1.2 (Boolean type) p1: When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.

  • 6.5.3.3 (Unary arithmetic operators) p5: The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int. The expression !E is equivalent to (0==E).

  • 6.5.13 (Logical AND operator) p3: The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it yields 0. The result has type int.

  • 6.5.14 (Logical OR operator) p3: The || operator shall yield 1 if either of its operands compare unequal to 0; otherwise, it yields 0. The result has type int.

  • 6.5.15 (Conditional operator) p4: The first operand is evaluated; there is a sequence point between its evaluation and the evaluation of the second or third operand (whichever is evaluated). The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0; the result is the value of the second or third operand (whichever is evaluated), converted to the type described below.

  • 6.8.4.1 (The if statement) p2: In both forms, the first substatement is executed if the expression compares unequal to 0. In the else form, the second substatement is executed if the expression compares equal to 0. If the first substatement is reached via a label, the second substatement is not executed.

  • 6.8.5 (Iteration statements) p4: An iteration statement causes a statement called the loop body to be executed repeatedly until the controlling expression compares equal to 0. The repetition occurs regardless of whether the loop body is entered from the iteration statement or by a jump.

"E compares equal to 0" is equivalent to the C expression (E == 0), and "E compares unequal to 0" is equivalent to the C expression (E != 0). The constraints of the equality operators are given by:

  • 6.5.9 (Equality operators) p2: One of the following shall hold:
    • both operands have arithmetic type;
    • both operands are pointers to qualified or unqualified versions of compatible types;
    • one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void; or
    • one operand is a pointer and the other is a null pointer constant.

Regarding the semantics of the equality operators where at least one operand is a pointer:

  • 6.5.9 (Equality operators) p5: Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a null pointer constant, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.

  • p6: Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

Regarding null pointer constants:

  • 6.3.2.3 (Pointers) p3: An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant67). If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

OP's questions

But what I want to know is: on systems where NULL is not all zeros, will a pointer value that is all zeros also evaluate to false in boolean contexts/comparisons?

Aside: NULL is a null pointer constant, not necessarily a null pointer (see 6.3.2.3p3 above where it could be an integer constant expression). What you really mean is a system where the bit representation of a null pointer is not all zeros.

Note: As pointed out by Eric Postpischil in the comments below, a system could have several bit representations of null pointer values, so we assume that none of them are all-zero bit representations for this question.

In order for the pointer value to evaluate to false in boolean contexts/comparisons, it must compare equal to 0. In this context, it must compare equal to a null pointer constant. By 6.5.9p5 above, the null pointer constant will be converted to the type of the pointer it is being compared to. By 6.5.9p6 above, a null pointer value will not compare equal to a non-null pointer value. So a non-null pointer value with all bits zero on a system where a null pointer value is not all bits zero will evaluate to true in a boolean context.

Or asked in a slightly different way: can I rely on calloc to produce a pointer value that will always evaluate to false in boolean contexts/comparisons?

No, you cannot rely on calloc (or memset with byte value 0) to produce a pointer value that will evaluate to false in boolean contexts. If a pointer value with an all-zero bit representation is not a null pointer value it will evaluate to true in boolean contexts.

2 Comments

“If a null pointer has a not all-zero bit representation, a pointer value with an all-zero bit representation will evaluate to true in boolean contexts” is not a valid inference. An implementation may have multiple bit patterns that constitute null pointers, including one that is all zeros and some that are not. This answer (and others here) fail to be clear in this regard.
@EricPostpischil OK I should change that to say a non-null pointer value with an all-zero bit representation will evaluate to true in boolean contexts.
3

There's a great discussion of NULL and 0 in the first answer to this question: What is the difference between NULL, '\0' and 0?

The punchline in that answer is:

Note that what counts as a null pointer in the C language does not depend on the underlying architecture. If the underlying architecture has a null pointer value defined as address 0xDEADBEEF, then it is up to the compiler to sort this mess out.

…Even on this funny architecture, the following ways are still valid ways to check for a null pointer:

    if (!pointer)
    if (pointer == NULL)
    if (pointer == 0)

And in the second answer to the same question…

A constant expression of type int with the value 0, or an expression of this type, cast to type void * is a null pointer constant, which if converted to a pointer becomes a null pointer. It is guaranteed by the standard to compare unequal to any pointer to any object or function.

(Short answer, yes, you can check for a NULL pointer with if (!ptr)).

5 Comments

But this doesn't answer to the crucial question: will the 1st printf in the program below ever output true?
Wait a minute @Howlium, I found another answer that now confuses me (stackoverflow.com/questions/21386995/is-int-0-a-null-pointer). The 1st answer uses memset to set the bits of a long* (y) to 0, then goes on to say that y == 0 is UB because y may be a trap representation. calloc clears bits just like memset does in that answer, so why is it any different?
@M.NejatAydin I guess not. The compiler knows the data types of the struct and will act accordingly.
@stderr I'm not convinced. The assignment obj *o = calloc(sizeof(obj), 1) sets all bits of the pointer o->p to zero. But that object representation (all bits zero) doesn't have to represent a null pointer.
@M.NejatAydin No, you're right. See the answer chux provided. C89 says the same. Footnote 127: Note that this need not be the same as the representation of floating-point zero or a null pointer.
2

Core Answer

But what I want to know is: on systems where NULL is *not* all zeros, will a pointer value that *is* all zeros also evaluate to false in boolean contexts/comparisons?

In a C implementation, the C standard allows any of:

  • All-bits-zero is a null pointer and no other bit pattern is.
  • All-bits-zero is a null pointer and one or more other bit patterns are.
  • All-bits-zero is not a null pointer and one or more other bit patterns are.

In other words, a C implementation may designate any one or more bit patterns to be null pointers, and this may or may not include all-bits-zero. (If the C implementation does allow multiple bit patterns to be null pointers, it must ensure they compare equal.)

… will the 1st printf in the program below ever output true?

It is allowed that it print “true”; the result of calloc is memory with all bits zero, and interpreting that memory as a void * may result in a pointer value that is not a null pointer value.

Supplement

… where NULL is *not* all zeros…

NULL is only something in source code. It is either 0 or ((void *) 0) or an equivalent. Wherever it is used as a pointer in source code (that is, you are doing normal things like if (pointer != NULL), not kludges like int x = 3 + NULL;), the compiler effectively converts it to a null pointer. That is, if all-bits-zero is not a null pointer in the C implementation, the compiler will compile pointer != NULL to a comparison of pointer to some bit pattern that does represent a null pointer.

So your questions are all about null pointers; they are not about NULL.

… on systems where…

The final determination of what is a null pointer lies with the C implementation, not the system it executes on. A C implementation may represent pointers in any way it wants and transform them as necessary when using machine addresses in instructions.

Comments

0

You can avoid such questions with an explicit and defensive coding style.

If you have a pointer _p, write constructs like

 (_p==NULL)?(A):(B) 

Now any reader knows immediately that your intent is to check whether _p is equal to NULL, and even on a machine where NULL might be different from an integer value of 0, the compiler will automatically do it correctly. A static code checker will also not warn you about relying on implicit behaviour.

 (_p)?(A):(B) 

is just not doing it right

But apart from that, it's an interesting technical question.

An interesting talk from the C++ committee from 2019 or 2020 revealed that even they are thinking about dropping compatibility with some odd undefined behaviour that was needed before 1970 for some 3-4 architectures. There hasn't been any known use of this stuff in the last decades, at least to my knowledge. As the first comment on your question states: you will hardly find any machine with such an issue, at least outside of a museum.

7 Comments

Whether there's anything wrong with the (_p)?(A):(B) option is strictly a question of style. If _p's value is a null pointer then that expression will evaluate to the same thing as (B), otherwise to the same thing as (A), regardless of _p's bit pattern. This is exactly equivalent to your other version. At best, saying that one "is just not doing it right" is subject to misinterpretation.
Well, I explicitly said: style. Not everything a compiler accepts, and not everything in a standard (at least for backward compatibility), is good code. And (_p)?()() is just not good code. And as this question showed, it's not safe from misinterpretation.
@schnedan "(_p)?()() is just not good code" Well, that's subjective. "And as this question showed, it's not safe from misinterpretation" Depends on who interprets it.
If you ever happen to debug a huge project, which other people have developed, you will train yourself to produce code which is clear, defensive, maintainable,...
Any ref to the "Talk from the C++ committee from 2019 or 2020"? Would like to see what ideas are being discussed about that.