5. Common pitfalls when using arrays.
5.1 Pitfall: Trusting type-unsafe linking.
OK, you’ve been told, or have found out yourself, that globals (namespace scope variables that can be accessed outside the translation unit) are Evil™. But did you know how truly Evil™ they are? Consider the program below, consisting of two files [main.cpp] and [numbers.cpp]:
    // [main.cpp]
    #include <iostream>

    extern int* numbers;

    int main()
    {
        using namespace std;
        for( int i = 0;  i < 42;  ++i )
        {
            cout << (i > 0? ", " : "") << numbers[i];
        }
        cout << endl;
    }

    // [numbers.cpp]
    int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
In Windows 7 this compiles and links fine with both MinGW g++ 4.4.1 and Visual C++ 10.0.
Since the types don't match, the program crashes when you run it.

In-the-formal explanation: the program has Undefined Behavior (UB), and instead of crashing it can therefore just hang, or perhaps do nothing, or it can send threatening e-mails to the presidents of the USA, Russia, India, China and Switzerland, and make Nasal Daemons fly out of your nose.
In-practice explanation: in main.cpp the array is treated as a pointer, placed at the same address as the array. For a 32-bit executable this means that the first int value in the array is treated as a pointer. I.e., in main.cpp the numbers variable contains, or appears to contain, (int*)1. This causes the program to access memory at the very bottom of the address space, which is conventionally reserved and trap-causing. Result: you get a crash.
The compilers are fully within their rights to not diagnose this error, because C++11 §3.5/10 says, about the requirement of compatible types for the declarations,
[N3290 §3.5/10]
A violation of this rule on type identity does not require a diagnostic.
The same paragraph details the variation that is allowed:
… declarations for an array object can specify array types that differ by the presence or absence of a major array bound (8.3.4).
This allowed variation does not include declaring a name as an array in one translation unit, and as a pointer in another translation unit.
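One way to avoid the mismatch is to declare the name as an array in every translation unit, ideally in a single header that both files include. The following is a minimal sketch, shown as one file for brevity; the header name numbers.h and the helper sum_first are hypothetical and not part of the original program.

```cpp
// What a shared header (hypothetically numbers.h) would contain.
// An array of unknown bound is compatible with the int[42] definition.
extern int numbers[];

// What numbers.cpp would contain.
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

// What main.cpp could then do safely (hypothetical helper).
int sum_first( int const n )
{
    int sum = 0;
    for( int i = 0; i < n; ++i ) { sum += numbers[i]; }
    return sum;
}
```

If numbers.cpp includes the same header, a declaration that disagrees with the definition becomes a compile error in that file instead of silent UB at run time.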
5.2 Pitfall: Doing premature optimization (memset & friends).
Not written yet
5.3 Pitfall: Using the C idiom to get number of elements.
With deep C experience it’s natural to write …
#define N_ITEMS( array ) (sizeof( array )/sizeof( array[0] ))
Since an array decays to pointer to first element where needed, the expression sizeof(a)/sizeof(a[0]) can also be written as sizeof(a)/sizeof(*a). It means the same, and no matter how it’s written it is the C idiom for finding the number of elements of an array.
Main pitfall: the C idiom is not typesafe. For example, the code …
    #include <stdio.h>

    #define N_ITEMS( array ) (sizeof( array )/sizeof( *array ))

    void display( int const a[7] )
    {
        int const n = N_ITEMS( a );     // Oops.
        printf( "%d elements.\n", n );
    }

    int main()
    {
        int const moohaha[] = {1, 2, 3, 4, 5, 6, 7};

        printf( "%d elements, calling display...\n", N_ITEMS( moohaha ) );
        display( moohaha );
    }
passes a pointer to N_ITEMS, and therefore most likely produces a wrong result. Compiled as a 32-bit executable in Windows 7 it produces …
7 elements, calling display...
1 elements.
- The compiler rewrites int const a[7] to just int const a[].
- The compiler rewrites int const a[] to int const* a.
- N_ITEMS is therefore invoked with a pointer.
- For a 32-bit executable sizeof(array) (size of a pointer) is then 4.
- sizeof(*array) is equivalent to sizeof(int), which for a 32-bit executable is also 4.
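The parameter type adjustment described in the list above can be checked at compile time. This is a minimal sketch assuming C++11 or later (for decltype and std::is_same); the helper names parameter_is_pointer and local_is_array are hypothetical.

```cpp
#include <type_traits>

// Hypothetical helper: true if the parameter's type is a pointer,
// despite the array syntax used in its declaration.
bool parameter_is_pointer( int const a[7] )
{
    (void) a;   // only the declared type is inspected
    return std::is_same< decltype( a ), int const* >::value;
}

// Hypothetical helper: true if a local array keeps its array type.
bool local_is_array()
{
    int const moohaha[] = {1, 2, 3, 4, 5, 6, 7};
    (void) moohaha;
    return std::is_same< decltype( moohaha ), int const[7] >::value;
}
```

Both functions return true: the bound 7 survives in the type of the local array, but not in the type of the parameter.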
In order to detect this error at run time you can do …
    #include <assert.h>
    #include <typeinfo>

    #define N_ITEMS( array ) ( \
        assert(( \
            "N_ITEMS requires an actual array as argument", \
            typeid( array ) != typeid( &*array ) \
            )), \
        sizeof( array )/sizeof( *array ) \
        )
7 elements, calling display...
Assertion failed: ( "N_ITEMS requires an actual array as argument", typeid( a ) != typeid( &*a ) ), file runtime_detection.cpp, line 16
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
The runtime error detection is better than no detection, but it wastes a little processor time, and perhaps much more programmer time. Better with detection at compile time! And if you're happy to not support arrays of local types with C++98, then you can do that:
    #include <stddef.h>

    typedef ptrdiff_t Size;

    template< class Type, Size n >
    Size n_items( Type (&)[n] ) { return n; }

    #define N_ITEMS( array ) n_items( array )
Compiling this definition substituted into the first complete program, with g++, I got …
M:\count> g++ compile_time_detection.cpp
compile_time_detection.cpp: In function 'void display(const int*)':
compile_time_detection.cpp:14: error: no matching function for call to 'n_items(const int*&)'
M:\count> _
How it works: the array is passed by reference to n_items, and so it does not decay to pointer to first element, and the function can just return the number of elements specified by the type.
With C++11 you can use this also for arrays of local type, and it's the type safe C++ idiom for finding the number of elements of an array.
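As a usage sketch of the n_items function above, the following repeats its definition so that the block is self-contained and uses it as a loop bound. The helper name sum_of_values is hypothetical.

```cpp
#include <stddef.h>

typedef ptrdiff_t Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

// Hypothetical helper that sums a local array, using n_items as loop bound.
int sum_of_values()
{
    int const values[] = {1, 2, 3, 4, 5, 6, 7};
    int sum = 0;
    for( Size i = 0; i < n_items( values ); ++i ) { sum += values[i]; }
    return sum;
}

// int const* p = 0;
// n_items( p );        // would not compile: p is a pointer, not an array
```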
5.4 C++11 - C++20 pitfall: Using a constexpr array size function.
With C++11 and later, it's natural to implement an array size function as follows:
    // Similar in C++03, but not constexpr.
    template< class Type, std::size_t N >
    constexpr std::size_t size( Type (&)[N] ) { return N; }
This yields the number of elements in an array as a compile time constant. This function has even been standardized as std::size in C++17.
For example, size() can be used to declare an array of the same size as another:
    // Example 1
    void foo()
    {
        int const x[] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
        int y[ size(x) ] = {};
    }
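For comparison, here is a short sketch of the standardized std::size mentioned above, assuming a C++17 compiler. The names primes and count_items are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <iterator>     // std::size (C++17)
#include <vector>

constexpr int primes[] = {2, 3, 5, 7, 11};
static_assert( std::size( primes ) == 5, "array size is a compile time constant" );

// Hypothetical helper: std::size also accepts containers with a size() member.
std::size_t count_items( std::vector<int> const& v )
{
    return std::size( v );
}
```

For the vector the result is an ordinary run time value; only the array overload yields a constant here.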
But consider this code using the constexpr version:
    // Example 2
    template< class Collection >
    void foo( Collection const& c )
    {
        constexpr int n = size( c );    // error prior to C++23
        // ...
    }

    int main()
    {
        int x[42];
        foo( x );
    }
The pitfall: until C++23, using the reference c in a constant expression is not allowed, and all major compilers reject this code. From the C++20 standard, [expr.const] p5.12:
An expression E is a core constant expression unless the evaluation of E, following the rules of the abstract machine, would evaluate one of the following:
- [...]
- an id-expression that refers to a variable or data member of reference type unless the reference has a preceding initialization and either
- it is usable in constant expressions or
- its lifetime began within the evaluation of E;
c is neither usable in constant expressions nor did its lifetime begin within the evaluation of constexpr int n = ..., so evaluating c is not a core constant expression. These restrictions have been lifted for C++23 by P2280: Using unknown pointers and references in constant expressions. In C++23, c is treated as a reference binding to an unspecified object ([expr.const] p8).
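By contrast, the problem does not arise when the array itself is named rather than a reference to it, which is why Example 1 compiles. The following is a minimal sketch using the size function from the start of this section; the helper name direct_size is hypothetical.

```cpp
#include <cstddef>

template< class Type, std::size_t N >
constexpr std::size_t size( Type (&)[N] ) { return N; }

// Hypothetical helper: x is the array itself, not a reference variable,
// so no reference of unknown origin is evaluated.
std::size_t direct_size()
{
    int x[42] = {};
    constexpr std::size_t n = size( x );    // OK in C++11 and later
    static_assert( n == 42, "size is known at compile time" );
    (void) x;
    return n;
}
```

The only reference involved is the function parameter, whose lifetime begins within the evaluation of the constant expression.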
5.4.1 Workaround: C++20-compatible constexpr size function
std::extent< decltype( c ) >::value is not a viable workaround because it would fail if Collection were not an array.
To deal with collections that can be non-arrays one needs the overloadability of a size function, but also, for compile time use, one needs a compile time representation of the array size. And the classic C++03 solution, which works fine also in C++11 and C++14, is to let the function report its result not as a value but via its function result type. For example like this:
    // Example 3 - OK (not ideal, but portable and safe)
    #include <array>
    #include <cstddef>

    // No implementation, these functions are never evaluated.
    template< class Type, std::size_t N >
    auto static_n_items( Type (&)[N] )
        -> char(&)[N];      // return a reference to an array of N chars

    template< class Type, std::size_t N >
    auto static_n_items( std::array<Type, N> const& )
        -> char(&)[N];

    #define STATIC_N_ITEMS( c ) ( sizeof( static_n_items( c )) )

    template< class Collection >
    void foo( Collection const& c )
    {
        constexpr std::size_t n = STATIC_N_ITEMS( c );
        // ...
    }

    int main()
    {
        int x[42];
        std::array<int, 43> y;
        foo( x );
        foo( y );
    }
About the choice of return type for static_n_items: this code doesn't use std::integral_constant because with std::integral_constant the result is represented directly as a constexpr value, reintroducing the original problem.
About the naming: part of this solution to the constexpr-invalid-due-to-reference problem is to make the choice of compile time constant explicit.
Until C++23, a macro like the STATIC_N_ITEMS above yields portability, e.g. to the clang and Visual C++ compilers, retaining type safety.
Related: macros do not respect scopes, so to avoid name collisions it can be a good idea to use a name prefix, e.g. MYLIB_STATIC_N_ITEMS.
std::arrays, std::vectors and gsl::spans - I would frankly expect an FAQ on how to use arrays in C++ to say "By now, you can start considering just, well, not using them."