Portable and safe way to add byte offset to any pointer

Question

I'm quite new at working with C++ and haven't grasped all the intricacies and subtleties of the language.

What is the most portable, correct and safe way to add an arbitrary byte offset to a pointer of any type in C++11?

SomeType* ptr; int offset = 12345 /* bytes */; ptr = ptr + offset; // <--

I found many answers on Stack Overflow and Google, but they all propose different things. Some variants I have encountered:

Cast to char *:

ptr = (SomeType*)(((char*)ptr) + offset);

Cast to unsigned int:

ptr = (SomeType*)(((unsigned int)ptr) + offset);

Cast to size_t:

ptr = (SomeType*)(((size_t)ptr) + offset);

"The size of size_t and ptrdiff_t always coincide with the pointer's size. Because of this, it is these types that should be used as indexes for large arrays, for storage of pointers and pointer arithmetic." - About size_t and ptrdiff_t on CodeProject
```
ptr = (SomeType*)((size_t)ptr + (ptrdiff_t)offset); 
```
Or like the previous, but with intptr_t instead of size_t, which is signed instead of unsigned:
```
ptr = (SomeType*)((intptr_t)ptr + (ptrdiff_t)offset); 
```
Only cast to intptr_t, since offset is already a signed integer and intptr_t is not size_t:
```
ptr = (SomeType*)(((intptr_t)ptr) + offset);
```

And in all these cases, is it safe to use old C-style casts, or is it safer or more portable to use static_cast or reinterpret_cast for this?

Should I assume the pointer value itself is unsigned or signed?

There isn't any. It's undefined behaviour to add an arbitrary byte offset to a pointer. You can only do arithmetic on pointers that point to the same array (and one past the end of it).
– jrok, Commented Apr 10, 2013 at 18:57
@jrok It's perfectly well defined to add an arbitrary offset to a pointer. What's undefined is dereferencing a pointer that doesn't point to valid memory.
– sfstewman, Commented Apr 10, 2013 at 18:59
@sfstewman It won't cause errors on the implementations I know, but IIRC there's a clause prohibiting going more than one object past the end of an array (e.g. int a[5]; a + 5; is good, int a[5]; a + 6 is bad). Edit: found a source: stackoverflow.com/a/988220/395760
– user395760, Commented Apr 10, 2013 at 19:07
@sfstewman: C++ draft n3092 5.7 5: “If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.”
– Eric Postpischil, Commented Apr 10, 2013 at 19:12
@sfstewman You're wrong. The standard explicitly makes it UB (see the comment above). In practice, yeah, it just works, at least until you smash your own stack or something like that.
– jrok, Commented Apr 10, 2013 at 19:14
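
The rule quoted from the draft standard can be illustrated with a minimal sketch. The function name distance_to_end is hypothetical and exists only for this illustration; the out-of-range line is left commented out because evaluating it would be undefined behaviour.

```cpp
#include <cstddef>

// Returns the number of elements between the start of a 5-element array and
// its one-past-the-end pointer. Forming `a + 5` is allowed; dereferencing it is not.
inline std::ptrdiff_t distance_to_end() {
    int a[5] = {0, 1, 2, 3, 4};
    int* first = a;
    int* end = a + 5;       // OK: one past the last element
    // int* bad = a + 6;    // undefined behaviour: beyond one past the end
    return end - first;     // well defined: both pointers belong to the same array
}
```

Calling distance_to_end() yields 5 on any conforming implementation, since both pointers stay within the bounds the standard allows.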

freddy.smith · Accepted Answer · 2013-04-10 18:58:16Z

17

I would use something like:

unsigned char* bytePtr = reinterpret_cast<unsigned char*>(ptr);
bytePtr += offset;
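
One situation where this cast-and-add pattern stays inside a single object is reaching a member through its byte offset. The sketch below assumes a hypothetical standard-layout struct Record and a hypothetical helper count_via_offset; neither comes from the answer, and the pattern is shown as commonly used in practice rather than as something the standard spells out in every detail.

```cpp
#include <cstddef>  // offsetof

// Hypothetical standard-layout type used only for this illustration.
struct Record {
    int   id;
    float weight;
    int   count;
};

// Reaches the `count` member by adding its byte offset to the object's address.
inline int* count_via_offset(Record* r) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(r);
    bytes += offsetof(Record, count);      // the result still points inside *r
    return reinterpret_cast<int*>(bytes);  // an int object lives at this address
}
```

Because the offset comes from offsetof, the resulting pointer has the same address as &r->count and is suitably aligned for int.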


26 Comments

Nik Bougalis Over a year ago

I would use the more condensed (reinterpret_cast<unsigned char *>(ptr) + offset) perhaps wrapped in an inline (possibly template) function, depending on how often I needed it and what the returned type ought to be.

Eric Postpischil Over a year ago

@Virtlink: unsigned char is preferred for working with bytes because the language standard requires it be a simple binary representation of the value and that all bit patterns correspond to a value. In contrast, char and signed char might use two’s complement, one’s complement, or signed magnitude and might have bit patterns that do not correspond to a value.

Eric Postpischil Over a year ago

@Virtlink: For the purposes of C and C++, an unsigned char is a byte. The allowances in the standard for the number of bits in a char to vary are for old or esoteric platforms where the memory is organized in something like 9-bit units, not so that a C or C++ implementation can give you 16-bit char objects while addressing uses 8-bit units.

Christian Rau Over a year ago

@Virtlink Yes, a char doesn't need to be 8 bits, but you know what, you don't care. char is guaranteed to be the unit in which C++ measures sizes and thus the granularity of your system's addressing. And mixing code written for an 8-bit platform (and working at that low a level) with code for a 9-bit platform is hopefully something you're not planning to do.

Christian Rau Over a year ago

@Virtlink Because the standard doesn't make any guarantees about casting to int, disturbing the int and casting back. The only thing you can do with a pointer cast to int is cast it back. Of course it will most probably work on any practical platform (in the same way any practical platform will have 8-bit chars), but it's really UB to use this pointer afterwards (and if you don't want to use it, then why add an offset anyway?). And in the end I don't even think anybody guarantees that the pointer converts into a byte address (again, on most practical platforms it will indeed do so).
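
The one integer conversion the comment describes as reliable, casting to an integer and straight back, might look like the sketch below. The helper name round_trip is hypothetical, and std::uintptr_t is an optional type, so this assumes the implementation provides it.

```cpp
#include <cstdint>  // std::uintptr_t (optional; assumed to be available here)

// Converts a pointer to an integer and back without modifying the integer.
// Only this unmodified round trip is guaranteed to give back the original pointer.
template <typename T>
T* round_trip(T* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<T*>(bits);
}
```

Adding an offset to bits before converting back is where the guarantee stops, which is why the char-pointer route is preferred in this thread.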


user2218982 · Accepted Answer · 2013-04-11 08:13:08Z

Using reinterpret_cast (or a C-style cast) means circumventing the type system and is neither portable nor safe. Whether it is correct depends on your architecture. If you (must) do it, you imply that you know what you are doing, and you are basically on your own from then on. So much for the warning.

If you add a number n to a pointer of type T*, you move this pointer by n elements of type T. What you are looking for is a type where 1 element means 1 byte.
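
As a rough illustration of that scaling, the sketch below measures how many bytes separate p and p + n for an array of double. The function byte_distance and the array size of 8 are assumptions made for this example only.

```cpp
#include <cstddef>

// Measures, in bytes, how far `p + n` lies from `p` for an array of doubles.
// n must be in [0, 8] so that the pointer stays within the array or one past it.
inline std::ptrdiff_t byte_distance(std::ptrdiff_t n) {
    static double values[8] = {};
    double* p = values;
    double* q = p + n;  // moves by n elements, not n bytes
    return reinterpret_cast<unsigned char*>(q) - reinterpret_cast<unsigned char*>(p);
}
```

The result is n * sizeof(double), which is why a byte-sized element type is needed to move by single bytes.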

From the sizeof section 5.3.3.1.:

The sizeof operator yields the number of bytes in the object representation of its operand. [...] sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1. The result of sizeof applied to any other fundamental type (3.9.1) is implementation-defined.

Note that there is no statement about sizeof(int), etc.
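
These guarantees can be checked at compile time. The sketch below uses static_assert for the three character types and CHAR_BIT from <climits> for the number of bits per byte; the helper bits_in_int is hypothetical and only shows that the size of int is left to the implementation.

```cpp
#include <climits>  // CHAR_BIT

// Guaranteed by the standard for every implementation.
static_assert(sizeof(char) == 1, "char is one byte");
static_assert(sizeof(signed char) == 1, "signed char is one byte");
static_assert(sizeof(unsigned char) == 1, "unsigned char is one byte");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Size of an int in bits; the value is implementation-defined.
inline int bits_in_int() {
    return static_cast<int>(sizeof(int)) * CHAR_BIT;
}
```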

Definition of byte (section 1.7.1.):

The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. [...] The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.

So, if sizeof returns the number of bytes and sizeof(char) is 1, then char has the size of one byte in C++. Therefore, char is logically a byte in C++, but not necessarily the de facto standard 8-bit byte. Adding n to a char* will return a pointer that is n bytes (in terms of the C++ memory model) away. Thus, if you want to play the dangerous game of manipulating an object's pointer bytewise, you should cast it to one of the char variants. If your type also has qualifiers like const, you should transfer them to your "byte type" too.

#include <cstddef>      // std::ptrdiff_t
#include <type_traits>  // std::conditional, std::is_const, ...

// Adds const to Dst if Src is const.
template <typename Dst, typename Src>
struct adopt_const {
    using type = typename std::conditional<
        std::is_const<Src>::value,
        typename std::add_const<Dst>::type, Dst>::type;
};

// Adds volatile to Dst if Src is volatile.
template <typename Dst, typename Src>
struct adopt_volatile {
    using type = typename std::conditional<
        std::is_volatile<Src>::value,
        typename std::add_volatile<Dst>::type, Dst>::type;
};

// Transfers both qualifiers from Src to Dst.
template <typename Dst, typename Src>
struct adopt_cv {
    using type = typename adopt_const<
        typename adopt_volatile<Dst, Src>::type, Src>::type;
};

// Moves p by delta bytes, keeping the cv-qualifiers of T.
template <typename T>
T* add_offset(T* p, std::ptrdiff_t delta) noexcept
{
    using byte_type = typename adopt_cv<unsigned char, T>::type;
    return reinterpret_cast<T*>(reinterpret_cast<byte_type*>(p) + delta);
}

Example
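
A minimal usage sketch follows; it is not the answer author's own example. It repeats a condensed form of the helper so that it compiles on its own, and the function third_element is hypothetical. It shows that a const int* goes in and a const int* comes out without any qualifier being cast away.

```cpp
#include <cstddef>
#include <type_traits>

// Condensed restatement of the cv-transfer trait, repeated so this sketch is self-contained.
template <typename Dst, typename Src>
struct adopt_cv {
    using with_const = typename std::conditional<
        std::is_const<Src>::value, const Dst, Dst>::type;
    using type = typename std::conditional<
        std::is_volatile<Src>::value, volatile with_const, with_const>::type;
};

template <typename T>
T* add_offset(T* p, std::ptrdiff_t delta) noexcept {
    using byte_type = typename adopt_cv<unsigned char, T>::type;
    return reinterpret_cast<T*>(reinterpret_cast<byte_type*>(p) + delta);
}

// The byte type picks up const from the source type.
static_assert(std::is_same<adopt_cv<unsigned char, const int>::type,
                           const unsigned char>::value,
              "const is transferred to the byte type");

// Hypothetical caller: steps through a const array by a byte offset of two elements.
// The array passed in must have at least three elements.
inline const int* third_element(const int* first) {
    return add_offset(first, 2 * static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

The offset here is a multiple of sizeof(int) and stays inside the array, so the result is a properly aligned pointer to an existing element.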

where23 · Accepted Answer · 2016-06-14 07:32:39Z

Please note that NULL is special: adding an offset to it is dangerous.
reinterpret_cast can't remove const or volatile qualifiers. The more portable way is a C-style cast.
Using reinterpret_cast with traits, as in @user2218982's answer, seems safer.

template <typename T>
inline void addOffset(std::ptrdiff_t offset, T*& ptr)
{
    if (!ptr)
        return;  // leave null pointers untouched
    ptr = (T*)((unsigned char*)ptr + offset);
}

user13670613 · Accepted Answer · 2020-06-03 06:50:02Z

Mine isn't as elegant, but I hope it is more readable. char *helper_ptr; helper_ptr = (char *) ptr;

Then you can traverse byte-by-byte using helper_ptr.

Note that ptr = (SomeType*)(((char*)ptr) + 1) advances ptr by exactly 1 byte, whereas plain ptr + 1 advances it by sizeof(SomeType) bytes.
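
A byte-by-byte traversal with such a helper pointer could be sketched as follows. The function bytes_of is hypothetical, and it uses unsigned char rather than char, following the recommendation in the comments on the first answer.

```cpp
#include <cstddef>
#include <vector>

// Copies the object representation of `value` one byte at a time.
template <typename T>
std::vector<unsigned char> bytes_of(const T& value) {
    const unsigned char* helper_ptr = reinterpret_cast<const unsigned char*>(&value);
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out.push_back(helper_ptr[i]);  // each index step advances exactly one byte
    }
    return out;
}
```

The returned vector always has sizeof(T) entries; their order depends on the platform's byte order.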

Mppl · Accepted Answer · 2013-04-10 19:09:33Z

-2

if you have:

myType *ptr;

and you do:

ptr+=3;

The compiler will most certainly increment your variable by:

3*sizeof(myType)

And it's the standard way to do it as far as I know.

If you want to iterate over let's say an array of elements of type myType that's the way to do it.

OK, if you want to cast, do it with:

myNewType *newPtr=reinterpret_cast < myNewType * > ( ptr )

Or stick to plain old C and do:

myNewType *newPtr=(myNewType *) ptr;

And then increment

edited Apr 10, 2013 at 19:09

answered Apr 10, 2013 at 19:01


3 Comments

Daniel A.A. Pelsmaeker Over a year ago

I know how it works when you don't cast. I want to add any byte offset (say, 0xABC bytes) to a pointer of any type MyType* regardless of its type's size. If MyType* ptr = (MyType*)0x1000 then I want to end up with ptr == (MyType*)0x1ABC.

Captain Obvlious Over a year ago

You should avoid using C-style casts in C++ for anything except perhaps numerical casts. The compiler will apply the first C++ cast that works except for dynamic_cast. One issue (among others) is that C-style casts can remove the constness of an object with no indication it's happening either in source or via a compiler diagnostic.
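
The constness issue can be seen in a small sketch. Both function names are hypothetical; the reinterpret_cast line is commented out because it would not compile.

```cpp
// The C-style cast silently drops `const`; nothing in the syntax flags it.
inline int* strip_const_c_style(const int* p) {
    return (int*)p;  // compiles; acts like a const_cast here
}

// The C++ casts force the intent to be spelled out.
inline int* strip_const_explicit(const int* p) {
    // return reinterpret_cast<int*>(p);  // ill-formed: casts away qualifiers
    return const_cast<int*>(p);           // explicit and easy to search for
}
```

Both functions return the same address; the difference is only in how visible the removal of const is to a reader or a code search.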

JBentley Over a year ago

How does this answer the question?
