44

I have a function that I would like to be able to return special values for failure and uninitialized (it returns a pointer on success).

Currently it returns NULL for failure, and -1 for uninitialized, and this seems to work... but I could be cheating the system. IIRC, addresses are always positive, are they not? (although since the compiler is allowing me to set an address to -1, this seems strange).

[update]

Another idea I had (in the event that -1 is risky) is to malloc a char at global scope, and use that address as a sentinel.

4 Comments
  • What are you trying to do with this? What's the interface being used for? Combining so many different kinds of outputs in the return value seems like the wrong way to do this. Commented Jul 22, 2010 at 0:10
  • @kirk.burleson: int* foo(){ return -1;} gives a warning on G++ warning: return makes pointer from integer without a cast. I'm not sure whether that proves or disproves your point, but I'm still irked when you say "C compilers don't care what you feed 'em and they'll try to compile anything." (In g++, that's an error error: invalid conversion from ‘int’ to ‘int*’, BTW.) Commented Jul 22, 2010 at 3:08
  • @kirk.burleson: in my last comment, the warning is on GCC, the error is on G++. Commented Jul 23, 2010 at 15:54
  • Why not return a simple two value struct with one value being the pointer and the second value being a status code? It will not be that much more trouble and provide much greater flexibility and would be thread safe as well. Commented Sep 13, 2016 at 11:23

13 Answers

83

No, addresses aren't always positive - on x86_64, pointers are sign-extended and the address space is clustered symmetrically around 0 (though it is usual for the "negative" addresses to be kernel addresses).

However, the point is mostly moot, since C only defines the meaning of < and > comparisons between pointers that point into the same object, or one past the end of an array. Pointers to completely different objects cannot be meaningfully compared other than for exact equality, at least in standard C: if (p < NULL) has no well-defined semantics.

You should create a dummy object with static storage duration and use its address as your uninitialised value:

    extern char uninit_sentinel;
    #define UNINITIALISED ((void *)&uninit_sentinel)

It's guaranteed to have a single, unique address across your program.


11 Comments

+1 Just to mention a modification of this idea. The dedicated sentinel has the disadvantage that you have to instantiate it in one of your objects. If you just want a macro, you could use the address of a known system variable that you judge not to be a valid result of your function. There are not too many such variables defined, but on a POSIX system e.g. environ would do the trick.
@caf: could you point to a resource to verify that pointers in amd64 are sign-extended? Never have I read anything that implies or states this. Perhaps you are referring to the requirement that canonical addresses must have bits 48 through 63 of any virtual address to be copies of bit 47? If this is what you mean, it does not imply "negative" pointers. Neither does RIP-relative addressing.
@mfukar: Personally I find the architecture ABI to be a more authoritative document than Wikipedia. In the end, as I'm sure you know, it is ultimately a matter of interpretation or how you conceptualise the address space anyway.
@mfukar: Perhaps you could enlighten us as to what the "negative half of the address space" refers to then, if not the obvious. (And at least I haven't been retconning my comments!)
The 'negative half of the address space' refers to the x64 variant of the System V OS specification's interpretation of the bit pattern as a signed value, and as such applies only to environments which choose to make that interpretation. It could easily have been written to say 'addresses with the msb set'. Windows takes an unsigned interpretation, and calls such addresses the high address space. There is nothing in the x64 instruction set which implies that the value should be interpreted one way or the other, though most OS I've seen use unsigned - +1 for finding a signed pointer environment
23

The valid values for a pointer are entirely implementation-dependent, so, yes, a pointer address could be negative.

More importantly, however, consider (as an example of a possible implementation choice) the case where you are on a 32-bit platform with a 32-bit pointer size. Any value representable in those 32 bits might be a valid pointer: other than the null pointer, any pointer value might be a valid pointer to an object.

For your specific use case, you should consider returning a status code and perhaps taking the pointer as a parameter to the function.

1 Comment

Careful though, if your pointer is too negative it might end up addressing the machine next to your current one.
18

It's generally a bad design to try to multiplex special values onto a return value... you're trying to do too much with a single value. It would be cleaner to return your "success pointer" via argument, rather than the return value. That leaves lots of non-conflicting space in the return value for all of the conditions you want to describe:

    int SomeFunction(SomeType **p)
    {
        *p = NULL;
        if (/* check for uninitialized ... */)
            return UNINITIALIZED;
        if (/* check for failure ... */)
            return FAILURE;
        *p = yourValue;
        return SUCCESS;
    }

You should also do typical argument checking (ensure that 'p' isn't NULL).

7 Comments

This is absolutely the right way to design this function. Anything else will be a maintenance disaster and a bug magnet for anyone else using the code, and should be strongly discouraged.
Possibly. The guy who "invented" null pointers said it was a mistake, IIRC. Another special-case value may be a problem. Even so, sometimes using two separate values where one will do leads to overcomplex code. A common approach for simplifying some common algorithms is to assign special-case past-the-end objects, for instance, rather than use nulls - it avoids special-case null checks. Having a "valid" flag still needs those at-the-end checks, just in a different form. A valid pointer to a special object is a special-case pointer, and often saves a lot of complexity.
"The guy" is C.A.R. Hoare. On the other hand, he more than made up for the "billion dollar mistake" with the invention of Quicksort :-)
@James - all those guys, they're just guys, you know? I probably should remember Hoare, but the who is just history. The ideas are more important. Also, I find it helps to be vague - hard for people to contradict me when they don't know who I'm quoting ;-)
@Steve314: Yes, in a very specific context (where you can control SomeType in my example) having some common "special case" objects/pointers can work and be a little more streamlined... but in the general case separating the status and the returned object is more maintainable.
6

The C language does not define the notion of "negativity" for pointers. The property of "being negative" is a chiefly arithmetical one, not in any way applicable to values of pointer type.

If you have a pointer-returning function, then you cannot meaningfully return the value of -1 from it. In the C language, integral values (other than zero) are not implicitly convertible to pointer types. An attempt to return -1 from a pointer-returning function is an immediate constraint violation that will result in a diagnostic message. In short, it is an error. If your compiler allows it, that simply means it doesn't enforce the constraint strictly (most of the time compilers do this for compatibility with pre-standard code).

If you force the value of -1 to pointer type by an explicit cast, the result of the cast will be implementation-defined. The language itself makes no guarantees about it. It might easily prove to be the same as some other, valid pointer value.

If you want to create a reserved pointer value, there is no need to malloc anything. You can simply declare a global variable of the desired type and use its address as the reserved value. It is guaranteed to be unique.

Comments

4

Pointers can be negative like an unsigned integer can be negative. That is, sure, in a two's-complement interpretation, you could interpret the numerical value to be negative because the most-significant-bit is on.

2 Comments

Are you saying that they can be negative when cast to a signed type, e.g. int? You already know this, I'm sure, but other readers might not: an unsigned number will not store negative values. The sign bit James refers to is present in signed types only, and is precisely what makes a datatype a signed type.
WonderWorker, the bit you mention is present in signed and unsigned types, the only difference is how you interpret that bit. In signed types, that is the sign. In unsigned types, that is the second half of the range of that type
1

What's the difference between failure and uninitialized? If uninitialized is not just another kind of failure, then you probably want to redesign the interface to separate these two conditions.

Probably the best way to do this is to return the result through a parameter, so the return value only indicates an error. For example where you would write:

    void* func();

    void* result = func();
    if (result == 0)
        /* handle error */
    else if (result == (void*)-1)
        /* uninitialized */
    else
        /* initialized */

Change this to

    // sets *a to the returned object
    // *a will be null if the object has not been initialized
    // returns zero on success, nonzero on error
    int func(void** a);

    void* result;
    if (func(&result)) {
        /* handle error */
        return;
    }
    /* do real stuff now */
    if (!result) {
        /* initialize */
    }
    /* continue using the result now that it's been initialized */

1 Comment

I'm not specifically returning uninitialized. I'm working with a linked list, which is passed in as an argument, but which may or may not be initialized. Previously I had it set to NULL initially, but this conflicted with my returning "NULL" for failure. Thanks for your suggestions.
0

@James is correct, of course, but I'd like to add that pointers don't always represent absolute memory addresses, which theoretically would always be positive. Pointers also represent relative addresses to some point in memory, often a stack or frame pointer, and those can be both positive and negative.

So your best bet is to have your function accept a pointer to a pointer as a parameter and fill that pointer with a valid pointer value on success while returning a result code from the actual function.

1 Comment

Sure? The relative offset is usually an int in my experience.
0

James's answer is probably correct, but of course it describes an implementation choice, not a choice that you can make.

Personally, I think addresses are "intuitively" unsigned. Finding a pointer that compares as less than a null pointer would seem wrong. But ~0 and -1, for the same integer type, give the same value. If addresses are intuitively unsigned, ~0 may make a more intuitive special-case value; I use it for error-case unsigned ints quite a lot. It's not really different (zero is an int by default, so ~0 is -1 until you cast it) but it looks different.

Pointers on 32-bit systems can use all 32 bits BTW, though -1 or ~0 is an extremely unlikely pointer to occur for a genuine allocation in practice. There are also platform-specific rules - for example on 32-bit Windows, a process can only have a 2GB address space, and there's a lot of code around that encodes some kind of flag into the top bit of a pointer (e.g. for balancing flags in balanced binary trees).

Comments

0

Actually (at least on x86), the NULL-pointer exception is generated not only by dereferencing the NULL pointer itself, but by accesses to a larger range of addresses (e.g. the first 64 KB). This helps catch errors such as

    int* x = NULL;
    x[10] = 1;

So, there are more addresses that are guaranteed to generate the NULL pointer exception when dereferenced. Now consider this code (made compilable for AndreyT):

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    #define ERR_NOT_ENOUGH_MEM (int)NULL
    #define ERR_NEGATIVE       ((int)NULL + 1)
    #define ERR_NOT_DIGIT      ((int)NULL + 2)

    char* fn(int i)
    {
        if (i < 0) return (char*)ERR_NEGATIVE;
        if (i >= 10) return (char*)ERR_NOT_DIGIT;
        char* rez = (char*)malloc(strlen("Hello World ") + sizeof(char)*2);
        if (rez)
            sprintf(rez, "Hello World %d", i);
        return rez;
    }

    int main()
    {
        char* rez = fn(3);
        switch ((int)rez) {
            case ERR_NOT_ENOUGH_MEM: printf("Not enough memory!\n"); break;
            case ERR_NEGATIVE:       printf("The parameter was negative\n"); break;
            case ERR_NOT_DIGIT:      printf("The parameter is not a digit\n"); break;
            default:                 printf("we received %s\n", rez);
        }
        return 0;
    }

this could be useful in some cases. It won't work on some Harvard architectures, but will work on von Neumann ones.

2 Comments

I'm not sure that's true "on x86" so much as on modern operating systems. The chip provides the ability to map a process address space to a physical address space etc, but it's the OS that usually decides which parts of the process address space are valid.
This will not even compile. Some C compilers with rather loose error checking will let you assign an integer value to a pointer (even though it is illegal in C), but none I know of will let you use a pointer as a controlling value for switch statement.
0

Do not use malloc for this purpose. It might keep unnecessary memory tied up (if a lot of memory is already in use when malloc gets called and the sentinel gets allocated at a high address, for example) and it confuses memory debuggers/leak detectors. Instead simply return a pointer to a local static const char object. This pointer will never compare equal to any pointer the program could obtain in any other way, and it only wastes one byte of bss.

2 Comments

Are you sure that the const char objects can never be coalesced like string literals can? (If you're sure they can't, perhaps you'd like to answer my question on the subject which (in my mind) had no real consensus.)
Yes, quite sure. See 6.5.9 paragraph 6.
0

You don't need to care about the signedness of a pointer, because it's implementation-defined. The real question here is "how to return special values from a function returning a pointer?", which I've explained in detail in my answer to the question Pointer address span on various platforms.

In summary, the all-ones bit pattern (-1) is (almost) always safe, because it's already at the end of the address spectrum: an object cannot be stored there and wrap around to the first address, and the malloc family never returns -1. In fact, this value is returned by many Linux system calls and Win32 APIs to indicate another state for the pointer (mmap's MAP_FAILED and Win32's INVALID_HANDLE_VALUE are both -1). So if you need just failure and uninitialized, it's a good choice.

But you can return far more error states by exploiting the fact that variables must be properly aligned (unless you specify some other arrangement). For example, in a pointer to int32_t the low 2 bits are always zero, which means only a quarter of the possible values are valid addresses, leaving all of the remaining bit patterns for you to use. A simple solution is to check the lowest bit:

    int* result = func();
    if (!result)
        error_happened();
    else if ((uintptr_t)result & 1)
        uninitialized();

In this way you can return both a valid pointer and some additional data at the same time.

You can also use the high bits to store data on 64-bit systems. On ARM there's a flag that tells the CPU to ignore the high bits in addresses. On x86 there isn't a similar thing, but you can still use those bits as long as you make the pointer canonical before dereferencing. See Using the extra 16 bits in 64-bit pointers


Comments

-1

NULL is the only valid error return in this case; this is true any time an unsigned value such as a pointer is returned. It may be true that in some cases pointers will not be large enough to use the sign bit as a data bit; however, since pointers are controlled by the OS, not the program, I would not rely on this behavior.

Remember that a pointer is basically a 32-bit value; whether this is a possibly negative or an always positive number is just a matter of interpretation, i.e. whether the 32nd bit is interpreted as the sign bit or as a data bit. So if you interpreted 0xFFFFFFFF as a signed number it would be -1; if you interpreted it as an unsigned number it would be 4294967295. Technically, it is unlikely that a pointer would ever be this large, but the case should be considered anyway.

As far as an alternative you could use an additional out parameter (returning NULL for all failures), however this would require clients to create and pass a value even if they don't need to distinguish between specific errors.

Another alternative would be to use the GetLastError/SetLastError mechanism to provide additional error information (This would be specific to Windows, don't know if that is an issue or not), or to throw an exception on error instead.

2 Comments

On 64-bit systems, pointers are 64-bit values stackoverflow.com/questions/399003/…
"a pointer is basically a 32-bit value" is far from correct. 64-bit systems have been available for decades, and not all systems are 32- or 64-bit. Have you ever used DOS, or something like a microcontroller with a 14-bit address bus?
-1

Positive and negative are not meaningful notions for pointer types. They pertain to signed integer types, including signed char, short, int, etc.

People talk about negative pointers mostly in situations that treat a pointer's machine representation as an integer type, e.g. reinterpret_cast<intptr_t>(ptr). In that case, they are actually talking about the converted integer, not the pointer itself.

In some scenarios a pointer is intuitively unsigned: we talk about one address being below or above another. 0xFFFFFFFF is above 0x0AAA0000, which is intuitive for human beings, even though 0xFFFFFFFF is actually "negative" as a signed value while 0x0AAA0000 is positive.

But other scenarios are less consistent. Pointer subtraction (ptr1 - ptr2) results in a signed value of type ptrdiff_t. Compare this with integer subtraction: signed_int_a - signed_int_b gives a signed int, while unsigned_int_a - unsigned_int_b produces an unsigned type. Pointer subtraction nevertheless produces a signed type, because its meaning is the distance between two pointers, measured in number of elements.

In summary I suggest treating pointer types as standalone types; every type has its own set of operations. For pointers (excluding function pointers, member function pointers, and void *) these are:

  1. +, +=: ptr + any_integer_type
  2. -, -=: ptr - any_integer_type, and ptr1 - ptr2
  3. ++, both prefix and postfix
  4. --, both prefix and postfix

Note there are no *, / or % operations for pointers. That also supports the view that a pointer should be treated as a standalone type, rather than "a type similar to int" or "a type whose underlying type is int, so it should look like int".

Comments
