Memory alignment in C-structs

Question

I'm working on a 32-bit machine, so I suppose that the memory alignment should be 4 bytes. Say I have this struct:

typedef struct {
    unsigned short v1;
    unsigned short v2;
    unsigned short v3;
} myStruct;

The sum of the member sizes is 6 bytes, and I expected the aligned size to be 8, but sizeof(myStruct) returns 6.

However if I write:

typedef struct {
    unsigned short v1;
    unsigned short v2;
    unsigned short v3;
    int i;
} myStruct;

the sum of the member sizes is 10 bytes, the aligned size should be 12, and this time sizeof(myStruct) == 12.

Can somebody explain the difference?

sizeof DOES include the padding; otherwise it would be useless for copying the struct. The whole point of sizeof is to tell you the actual size in memory.
– Martin Beckett, Commented Mar 25, 2011 at 17:19
Does this answer your question? Why isn't sizeof for a struct equal to the sum of sizeof of each member?
– RobertS supports Monica Cellio, Commented Jun 29, 2020 at 14:39

HelloWorld123456789 · Accepted Answer · 2014-04-17 03:35:29Z

67

At least on most machines, a type is only ever aligned to a boundary as large as the type itself [Edit: you can't really demand any "more" alignment than that, because you have to be able to create arrays, and you can't insert padding into an array]. On your implementation, short is apparently 2 bytes, and int 4 bytes.

That means your first struct is aligned to a 2-byte boundary. Since all the members are 2 bytes apiece, no padding is inserted between them.

The second contains a 4-byte item, which gets aligned to a 4-byte boundary. Since it's preceded by 6 bytes, 2 bytes of padding is inserted between v3 and i, giving 6 bytes of data in the shorts, two bytes of padding, and 4 more bytes of data in the int for a total of 12.

edited Apr 17, 2014 at 3:35

HelloWorld123456789


answered Mar 25, 2011 at 17:19

Jerry Coffin



7 Comments

Ivan Over a year ago

Thanks everyone. I chose this answer because it was the first correct one. So the compiler aligns the struct to the size of the largest element.

Mark B Over a year ago

@Ivan technically it aligns to the largest alignment of any item in the struct (which is subtly different than what you said).

Vitalik Verhovodov Over a year ago

How does it behave in arrays of such structs? Do we lose a proportional amount of memory there?

Jerry Coffin Over a year ago

@VitalikVerhovodov: An array can't have padding between items, so an array of N items always has a size exactly N * the size of a single item.

Peter - Reinstate Monica Over a year ago

@Ivan You cannot make general assumptions (independent of CPU/compiler/options) about member alignment. There may or may not be padding, and if there is, it may or may not be to the byte boundary corresponding to the member size. A compiler is free to pad more than the type size in structs even if it cannot do so in arrays. (I don't know whether that happens anywhere in reality, but on a 64 bit machine it is not inconceivable that 64 bit aligned access may be the fastest, so a struct of 4 chars may have a size of 32 bytes -- as would have a struct of 4 shorts or 4 ints.)


Evandro Coan · Accepted Answer · 2017-03-03 05:03:30Z

Forget about having different members: even if you write two structs whose members are exactly the same and differ only in the order in which they are declared, the size of each struct can be (and often is) different.

For example, see this,

#include <iostream>
using namespace std;

struct A
{
    char c;
    char d;
    int i;
};

struct B
{
    char c;
    int i;   //note the order is different!
    char d;
};

int main() {
    cout << sizeof(A) << endl;
    cout << sizeof(B) << endl;
}

Compile it with gcc-4.3.4, and you get this output:

8
12

That is, the sizes are different even though both structs have the same members!

Code at Ideone : http://ideone.com/HGGVl

The bottom line is that the Standard doesn't say how padding should be done, so compilers are free to make any decision, and you cannot assume all compilers make the same decision.

Hm, is this splitting hairs now? Even with gcc 4.3.4, you can get different results on another platform (we used gcc on a TI DSP quite a while ago where sizeof(char) == sizeof(int) and CHAR_BIT == 16. Sure, there you could have played the same game with int and long...).
"Compile it with gcc-4.3.4, and you get this output:" is indeed completely wrong and confusing for novices. You cannot assume that every implementation has the same alignment requirements just because it uses the same compiler. Your point that the order matters too is excellent, but the impression that a compiler always gives the same result is inappropriate and wrong. -1 until you change it.

RichieHindle · Accepted Answer · 2011-03-25 17:20:25Z

By default, values are aligned according to their size. So a 2-byte value like a short is aligned on a 2-byte boundary, and a 4-byte value like an int is aligned on a 4-byte boundary.

In your example, 2 bytes of padding are added before i to ensure that i falls on a 4-byte boundary.

(The entire structure is aligned on a boundary at least as big as the biggest value in the structure, so your structure will be aligned to a 4-byte boundary.)

The actual rules vary according to the platform - the Wikipedia page on Data structure alignment has more details.

Compilers typically let you control the packing via (for example) #pragma pack directives.

I would like to understand why a short is aligned on a 2-byte boundary. From an optimization point of view, it should be enough to avoid a datum occupying two different 4-byte words. For example, why not a structure like this: struct C { int i:8; char c; int j:8; };?
@Peregring-lk You should probably (if you have not done so yet) dedicate a separate question to this.

RobertS supports Monica Cellio · Accepted Answer · 2020-06-29 13:44:04Z

Assuming:

sizeof(unsigned short) == 2
sizeof(int)            == 4

Then I personally would use the following (your compiler may differ):

unsigned shorts are aligned to 2 byte boundaries
int will be aligned to 4 byte boundaries.

typedef struct
{
    unsigned short v1;    // 0 bytes offset
    unsigned short v2;    // 2 bytes offset
    unsigned short v3;    // 4 bytes offset
} myStruct;               // End 6 bytes.

// No part is required to align tighter than 2 bytes.
// So whole structure can be 2 byte aligned.

typedef struct
{
    unsigned short v1;    // 0 bytes offset
    unsigned short v2;    // 2 bytes offset
    unsigned short v3;    // 4 bytes offset
    /// Padding           // 6-7 padding (so i is 4 byte aligned)
    int i;                // 8 bytes offset
} myStruct;               // End 12 bytes

// Whole structure needs to be 4 byte aligned.
// So that i is correctly aligned.

Abhay Buch · Accepted Answer · 2011-03-25 18:41:05Z

Firstly, while the specifics of padding are left up to the compiler, the OS also imposes some rules as to alignment requirements. This answer assumes that you are using gcc, though the OS may vary.

To determine the space occupied by a given struct and its elements, you can follow these rules:

First, assume that the struct always starts at an address that is properly aligned for all data types.

Then for every entry in the struct:

The minimum space needed is the raw size of the element given by sizeof(element).
The alignment requirement of the element is the alignment requirement of the element's base type. Notably, this means that the alignment requirement for a char[20] array is the same as the requirement for a plain char.

Finally, the alignment requirement of the struct as a whole is the maximum of the alignment requirements of each of its elements.

gcc will insert padding after a given element to ensure that the next one (or the struct if we are talking about the last element) is correctly aligned. It will never rearrange the order of the elements in the struct, even if that will save memory.

Now the alignment requirements themselves are also a bit odd.

32-bit Linux requires that 2-byte data types have 2-byte alignment (their addresses must be even). All larger data types must have 4-byte alignment (addresses ending in 0x0, 0x4, 0x8 or 0xC). Note that this applies to types larger than 4 bytes as well (such as double and long double).
32-bit Windows is more strict in that if a type is K bytes in size, it must be K-byte aligned. This means that a double can only be placed at an address ending in 0x0 or 0x8. The only exception to this is the long double, which is still 4-byte aligned even though it is actually 12 bytes long.
For both Linux and Windows, on 64-bit machines, a K byte type must be K byte aligned. Again, the long double is an exception and must be 16-byte aligned.

Jonathan · Accepted Answer · 2011-03-25 17:18:15Z

4

Each data type needs to be aligned on a memory boundary of its own size. So a short needs to be aligned on a 2-byte boundary, and an int needs to be on a 4-byte boundary. Similarly, a long long would need to be on an 8-byte boundary.

answered Mar 25, 2011 at 17:18

Jonathan


6 Comments

fazo Over a year ago

no, it doesn't need to - it's just compiled like that to produce faster code (look at pragma packed in gcc)

Jonathan Over a year ago

@fazo Technically the compiler can generate code to access unaligned memory, but it requires multiple load instructions and some combination instructions, as the processor will not allow you to load unaligned data in one operation. That is why you need to explicitly tell it to not pad. Your best option is to put the largest types first, so no padding is necessary.

Jerry Coffin Over a year ago

@Jonathan: that depends on the processor. An x86 (for one obvious example) can handle mis-aligned accesses entirely in hardware -- all of the multiple reads (if necessary) and assembly are handled in the hardware.

Mark B Over a year ago

And on the other end of the scale, hardware like SPARC will generate a bus error if you access mis-aligned variables.

Jonathan Over a year ago

So basically, it's best not to do it :-P


NPE · Accepted Answer · 2011-03-25 17:17:33Z

2

The reason for the second sizeof(myStruct) being 12 is the padding that gets inserted between v3 and i to align i at a 32-bit boundary. There are two bytes of it.

Wikipedia explains the padding and alignment reasonably clearly.

answered Mar 25, 2011 at 17:17

NPE


4 Comments

Hossein Over a year ago

So you mean sizeof calculates the padding between the vars, but not the alignment bytes at the end/beginning of the struct?

Ivan Over a year ago

yes, I suppose that it has to return me alignment bytes at the end/beginning

NPE Over a year ago

@Ivan Your first struct is aligned. The alignment there is on a 2-byte boundary, since that's the size of the largest element of the struct. The second struct is aligned on a 4-byte boundary (because of the int). Neither needs any padding at the beginning or end.

Lindydancer Over a year ago

@Hossein sizeof includes both the padding inside the struct and at the end. The size of myStruct is 6 bytes since there is no padding at the end of the struct. The reason for this is that you don't need it, since it only contains members that require 2-byte alignment.

Mark B · Accepted Answer · 2011-03-25 17:20:00Z

In your first struct, since every item is of size short, the whole struct can be aligned on short boundaries, so it doesn't need to add any padding at the end.

In the second struct, the int (presumably 32 bits) needs to be word aligned so it inserts padding between v3 and i to align i.

evandrix · Accepted Answer · 2021-11-16 00:48:17Z

The standard doesn't say much about the layout of structs with complete types - it's up to the compiler. It decided that it needs the int to start on a boundary to access it, but since it has to do sub-boundary memory addressing for the shorts anyway, there is no need to pad them.

Necrolis · Accepted Answer · 2011-03-25 17:21:05Z

Sounds like it's being aligned to boundaries based on the size of each variable, so that the address is a multiple of the size being accessed (so shorts are aligned to 2, ints to 4, etc.). If you moved one of the shorts after the int, no padding would be needed before the int, but sizeof(mystruct) would typically still be 12, because 2 bytes of trailing padding keep the size a multiple of the int's 4-byte alignment. Of course this all depends on the compiler being used and what settings it's using in turn.
