12

Why is the size of an enum always 2 or 4 bytes (on a 16- or 32-bit architecture respectively), regardless of the number of enumerators in the type?

Does the compiler treat an enum like it does a union?

14
  • 2
    @GrijeshChauhan: True in C, not in C++ -- and in C, the size of an enum constant can differ from the size of the type. Commented Jul 26, 2013 at 18:21
  • 2
    What happens if you create 2,147,483,648 enums? I bet the compiler gets cranky.... Commented Jul 26, 2013 at 18:25
  • 3
    @Jim I have the same question; nobody wants to create more than a few hundred symbolic constants. Anyone who does is probably a bad programmer. Commented Jul 26, 2013 at 18:34
  • 7
    @Jim Just for grins, if you assume the typical uppercase alpha enum identifier, and stick to uniform lengths, each identifier would have to be 7 characters, plus a comma and space, which means the source file itself would be on the order of 20GB. I don't expect many compilers would be particularly happy trying to chew through that... Commented Jul 26, 2013 at 18:41
  • 1
    @LorenPechtel: enum too_big { big = INT_MAX, bigger }; Commented Jul 26, 2013 at 22:25

6 Answers

20

In both C and C++, the size of an enum type is implementation-defined, and is the same as the size of some integer type.

A common approach is to make all enum types the same size as int, simply because that's typically the type that makes for the most efficient access. Making it a single byte, for example, would save a very minor amount of space, but could require bigger and slower code to access it, depending on the CPU architecture.

In C, enumeration constants are by definition of type int. So given:

enum foo { zero, one, two };
enum foo obj;

the expression zero is of type int, but obj is of type enum foo, which may or may not have the same size as int. Given that the constants are of type int, it tends to be easier to make the enumerated type the same size.

In C++, the rules are different; the constants are of the enumerated type. But again, it often makes the most sense for each enum type to be one "word", which is typically the size of int, for efficiency reasons.

And the 2011 ISO C++ standard added the ability to specify the underlying integer type for an enum type. For example, you can now write:

enum foo: unsigned char { zero, one, two }; 

which guarantees that both the type foo and the constants zero, one, and two have a size of 1 byte. C had no equivalent feature before C23, and it's not supported by older pre-2011 C++ compilers (unless they provide it as a language extension).

(Digression follows.)

So what if you have an enumeration constant too big to fit in an int? You don't need 2^31, or even 2^15, distinct constants to do this:

#include <limits.h>
enum huge { big = INT_MAX, bigger };

The value of big is INT_MAX, which is typically 2^31-1, but can be as small as 2^15-1 (32767). The value of bigger is implicitly big + 1.

In C++, this is ok; the compiler will simply choose an underlying type for huge that's big enough to hold the value INT_MAX + 1. (Assuming there is such a type; if int is 64 bits and there's no integer type bigger than that, that won't be possible.)

In C, since enumeration constants are of type int, the above is invalid. It violates the constraint stated in N1570 6.7.2.2p2:

The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.

and so a compiler must reject it, or at least warn about it. gcc, for example, says:

error: overflow in enumeration values


10

An enum is not a structure; it's just a way of giving names to a set of integers. The size of a variable of this type is just the size of the underlying integer type, which must be able to hold every value in the enum. So as long as all the values fit in the same integer type, the size won't change.

2 Comments

I don't think the OP thought enums were structures. You might expect an enum with 256 elements to occupy 8 bits, and one with 257 elements to occupy, say, 16 bits. But compilers commonly make all enums the same size as an int, just because that makes for more efficient access.
@KeithThompson I read it as the OP looking at enums as structs. In fact, his comment mentions that he thought the compiler would treat the enum as a union.
8

The size of an enum is implementation-defined -- the compiler is allowed to choose whatever size it wants, as long as it's large enough to fit all of the values. Some compilers choose to use 4-byte enums for all enum types, while some compilers will choose the smallest type (e.g. 1, 2, or 4 bytes) which can fit the enum values. The C and C++ language standards allow both of these behaviors.

From C99 §6.7.2.2/4:

Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration.

From C++03 §7.2/5:

The underlying type of an enumeration is an integral type that can represent all the enumerator values defined in the enumeration. It is implementation-defined which integral type is used as the underlying type for an enumeration except that the underlying type shall not be larger than int unless the value of an enumerator cannot fit in an int or unsigned int. If the enumerator-list is empty, the underlying type is as if the enumeration had a single enumerator with value 0. The value of sizeof() applied to an enumeration type, an object of enumeration type, or an enumerator, is the value of sizeof() applied to the underlying type.

5 Comments

Good answer, but one thing I can't understand: why is sizeof(enum) == sizeof(int) and not sizeof(char)? Nobody wants to create 2^sizeof(int) symbolic constants.
@GrijeshChauhan: sizeof(enum) is not necessarily equal to the sizeof(int). It could be sizeof(enum) == sizeof(some integral type) where some integral type depends on the values of the enum.
@GrijeshChauhan: If all of the enum values fit within the range of a char, then it is possible that you'll have sizeof(enum) == sizeof(char). There are no guarantees about that, however, it's completely up to the compiler to decide.
sizeof(enum) == sizeof(int) is only true for C. For C++, it's implementation-defined.
@PHIfounder: It's not necessarily true for C either. In both C and C++, the underlying integer type for an enum type is implementation-defined; it can be char or unsigned char as long as all the constants are representable in that type. C enumeration constants are of type int, but the type itself is not.
3

It seems to me that the OP has assumed that an enum is some kind of collection which stores the values declared in it. This is incorrect.

An enumeration in C/C++ is simply an integer type with a set of named values. The enumerator names are essentially aliases for numbers.

The storage size is not influenced by the number of values in the enumeration. It is implementation-defined, but in most cases it is sizeof(int).

1

An enum is stored as "an integral type at least large enough to contain any of the values specified in the declaration". Many compilers will just use an int (possibly unsigned), but some will use a char or short, depending on optimization or other factors. An enum with fewer than 128 possible values would fit in a char (256 for unsigned char), and you would have to have 32768 (or 65536) values to overflow a short, and either 2 or 4 billion values to outgrow an int on most modern systems.

An enum is essentially just a better way of defining a bunch of different constants. Instead of this:

#define FIRST 0
#define SECOND 1
...

you just write:

enum myenum { FIRST, SECOND, ... }; 

It helps avoid assigning duplicate values by mistake, and removes your need to even care what the particular values are (unless you really need to).

3 Comments

I have a doubt, by defining FIRST 0, does it allocate any memory ?
@jhakash: No, since that doesn't create an object; it's just a compile-time definition. (BTW, the use of "doubt" to mean "question" tends to be specific to south Asian dialects of English; in US and UK English, "doubt" tends to imply disbelief. "I have a question" would be clearer.)
@KeithThompson Oops. I didn't know about the "doubt" thing either; thanks for the clarification.
0

The big problem with making an enum type smaller than int when a smaller type could fit all the values is that it would make the ABI for a translation unit dependent on the number of enumeration constants. For instance, suppose you have a library that uses an enum type with 256 constants as part of its public interface, and the compiler chooses to represent the type as a single byte. Now suppose you add a new feature to the library and now need 257 constants. The compiler would have to switch to a new size/representation, and now all object files compiled for the old interface would be incompatible with your updated library; you would have to recompile everything to make it work again.

This is why most implementations default to int for enum types unless a larger type is needed.
