
I am writing a program to unpack PE files. I have a struct, Pe_SymbolHeader. It looks like this:

typedef struct _Pe_SymbolHeader {
    char Name[8];                // 8
    uint32_t Value;              // 12
    uint16_t SectionNumber;      // 14
    uint16_t Type;               // 16
    uint8_t StorageClass;        // 17
    uint8_t NumberOfAuxSymbols;  // 18
} Pe_SymbolHeader;

gcc is telling me the size of this struct is 20 bytes.

printf("sizeof Pe_SymbolHeader %zu\n", sizeof(Pe_SymbolHeader));

I decided to place a Pe_SymbolHeader on the stack and take a look at where everything was located in memory:

Pe_SymbolHeader test;
printf("%p\n", (void *)&(test.Name));
printf("%p\n", (void *)&(test.Value));
printf("%p\n", (void *)&(test.SectionNumber));
printf("%p\n", (void *)&(test.Type));
printf("%p\n", (void *)&(test.StorageClass));
printf("%p\n", (void *)&(test.NumberOfAuxSymbols));

This gave me the following, which seems ok:

0x7fffffffe150
0x7fffffffe158
0x7fffffffe15c
0x7fffffffe15e
0x7fffffffe160
0x7fffffffe161

So if gcc is using 18 bytes to store my struct, why is sizeof telling me the struct will take 20 bytes?

Edit: OK, it seems that what gcc is doing to try to help me is what is killing me, and several answers are correct. I can only accept one, but thank you to those who answered.

1
  • the real keywords are alignment and padding (<-- note) Commented Oct 2, 2011 at 21:27

4 Answers


Because of padding for alignment


There is padding at the end of the struct.

The reason has to do with what happens in an array or possibly some other context where something follows your struct. That something might be another instance of this struct. The struct contains a 32-bit object so it will have an alignment requirement of 32 bits.

The compiler very much wants the next item to start on a natural word boundary for the architecture it is compiling for, so that any field in the next object can be read with a single operation instead of two operations plus some fiddling to combine two different "words".
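The trailing padding can be measured directly with offsetof and sizeof. The following is a minimal sketch that reuses the struct from the question; the helper names end_of_members and tail_padding are made up for illustration, and the values noted in the comments assume a typical target where uint32_t is 4-byte aligned (such as gcc on x86-64).

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

typedef struct _Pe_SymbolHeader {
    char     Name[8];
    uint32_t Value;
    uint16_t SectionNumber;
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} Pe_SymbolHeader;

/* Offset one past the last named member: where the real data ends.
   Assumed to be 18 on common targets. */
static size_t end_of_members(void) {
    return offsetof(Pe_SymbolHeader, NumberOfAuxSymbols) + sizeof(uint8_t);
}

/* Unnamed bytes the compiler appended after the last member.
   Assumed to be 2 when uint32_t requires 4-byte alignment. */
static size_t tail_padding(void) {
    return sizeof(Pe_SymbolHeader) - end_of_members();
}
```

Printing both values with the %zu format shows that the members end at offset 18 while sizeof reports the padded size, which matches the addresses in the question.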


3

The uint32_t part of the structure needs to be aligned on a multiple of 4 bytes, so the size of the structure has to be a multiple of 4 bytes to ensure that an array of the structure will not cause trouble (misaligned access trouble - which can lead to SIGBUS errors on some machines and to (very) inefficient access on (most) other machines). Therefore, the compiler mandates 2 padding bytes at the end of the structure; these have no name, so you cannot legitimately access them.

2 Comments

It is, isn't it? It's aligned after an 8-byte char array.
Consider an array of 2 items, each 18 bytes long. In the item with index 0, yes, the uint32_t would be on a 4-byte boundary. In the item with index 1, however, its offset would be a multiple of 2 bytes but not a multiple of 4, so that would fail. Rather than let that happen, GCC adds the padding bytes. Note that (char *)&array[i+1] == ((char *)&array[i] + sizeof(array[i])).
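The array argument can be checked with a little arithmetic. This is a rough sketch, again using the struct from the question; element_stride and value_offset are hypothetical helper names, and the stride of 18 is only the imagined unpadded case, not something the compiler would produce for this struct.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

typedef struct _Pe_SymbolHeader {
    char     Name[8];
    uint32_t Value;
    uint16_t SectionNumber;
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} Pe_SymbolHeader;

/* Distance in bytes between consecutive array elements.
   By definition this equals sizeof(Pe_SymbolHeader). */
static size_t element_stride(void) {
    Pe_SymbolHeader array[2];
    return (size_t)((char *)&array[1] - (char *)&array[0]);
}

/* Offset of Value in element i, measured from the start of the array,
   for a given element stride (pass 18 to model the unpadded case). */
static size_t value_offset(size_t i, size_t stride) {
    return i * stride + offsetof(Pe_SymbolHeader, Value);
}
```

With a stride of 18, value_offset(1, 18) is 26, which is not a multiple of 4; with the padded stride of 20 it is 28, which is.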
2

So if gcc is using 18 bytes to store my struct, why is sizeof telling me the struct will take 20 bytes?

Put 2 structs like this on the stack and print the same thing for both of them; this will enlighten you.

1 Comment

Yes, gcc aligns the structs to a boundary (in my specific case, a 4-byte boundary), but this does not change the size of the Pe_SymbolHeader struct. This is important when doing things like accessing a Pe_SymbolHeader in the PE file (sizeof(Pe_SymbolHeader) * symbol_index).
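If the struct has to match the 18-byte layout used in the file, one common approach is to ask the compiler not to pad it. The sketch below uses #pragma pack, which gcc accepts; the names Pe_SymbolHeaderPacked and symbol_file_offset are hypothetical, and the 18-byte size is taken from the member sizes listed in the question.

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>

/* Pack to 1-byte alignment so no padding is inserted anywhere. */
#pragma pack(push, 1)
typedef struct {
    char     Name[8];
    uint32_t Value;
    uint16_t SectionNumber;
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} Pe_SymbolHeaderPacked;  /* hypothetical name, not from the question */
#pragma pack(pop)

/* Fail at compile time if the compiler still adds padding. */
_Static_assert(sizeof(Pe_SymbolHeaderPacked) == 18,
               "packed symbol record should be 18 bytes");

/* Byte offset of record symbol_index, given where the table starts
   in the file (table_start is an assumed input). */
static size_t symbol_file_offset(size_t table_start, size_t symbol_index) {
    return table_start + symbol_index * sizeof(Pe_SymbolHeaderPacked);
}
```

The trade-off is that members of a packed struct may sit at misaligned addresses, so taking a pointer to Value and dereferencing it can be slow or can fault on some machines. Copying each record into a normally aligned struct with memcpy avoids that.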
1
uint32_t Value; 

This is adding 6 bytes as opposed to the expected 4. I tend to agree with Jonathan Leffler about the underlying reasons.
