Not making large arrays static, even when they're constexpr, can have a dramatic performance impact and lead to many missed optimizations. It may slow down your code by orders of magnitude. A constexpr variable declared inside a function is still a local variable with automatic storage duration, and the compiler may decide to initialize it at runtime instead of storing it as data in the executable.
Consider the following example:
    template <int N>
    void foo();

    void bar(int n)
    {
        // array of four function pointers to void(void)
        constexpr void(*table[])(void) {
            &foo<0>,
            &foo<1>,
            &foo<2>,
            &foo<3>
        };
        // look up function pointer and call it
        table[n]();
    }
You probably expect gcc-10 -O3 to compile bar() to a jmp to an address which it fetches from a table, but that is not what happens:
    bar(int):
            mov     eax, OFFSET FLAT:_Z3fooILi0EEvv
            movsx   rdi, edi
            movq    xmm0, rax
            mov     eax, OFFSET FLAT:_Z3fooILi2EEvv
            movhps  xmm0, QWORD PTR .LC0[rip]
            movaps  XMMWORD PTR [rsp-40], xmm0
            movq    xmm0, rax
            movhps  xmm0, QWORD PTR .LC1[rip]
            movaps  XMMWORD PTR [rsp-24], xmm0
            jmp     [QWORD PTR [rsp-40+rdi*8]]
    .LC0:
            .quad   void foo<1>()
    .LC1:
            .quad   void foo<3>()
This is because GCC decides not to store table in the executable's data section, but instead initializes a local variable with its contents every time the function runs. In fact, if we remove constexpr here, the compiled binary is 100% identical.
This can easily be 10x slower than the following code:
    template <int N>
    void foo();

    void bar(int n)
    {
        static constexpr void(*table[])(void) {
            &foo<0>,
            &foo<1>,
            &foo<2>,
            &foo<3>
        };
        table[n]();
    }
Our only change is that we have made table static, but the impact is enormous:
    bar(int):
            movsx   rdi, edi
            jmp     [QWORD PTR bar(int)::table[0+rdi*8]]
    bar(int)::table:
            .quad   void foo<0>()
            .quad   void foo<1>()
            .quad   void foo<2>()
            .quad   void foo<3>()
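If you want to try the pattern locally, here is a minimal self-contained sketch of the static version. It is not the exact code from above: the handlers return their own index so the call is observable, and the names `handler` and `dispatch`, the unsigned index type, and the `assert` precondition are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical handlers (unlike foo<N> above, these have a body and
// return their own index so the dispatch result can be checked).
template <int N>
int handler() { return N; }

int dispatch(std::size_t n)
{
    // static constexpr: the table has static storage duration, so the
    // compiler can emit it once as data rather than build it on the stack.
    static constexpr int (*table[])() = {
        &handler<0>,
        &handler<1>,
        &handler<2>,
        &handler<3>
    };
    constexpr std::size_t size = sizeof(table) / sizeof(table[0]);

    // Precondition added for this sketch; bar() above indexes unchecked.
    assert(n < size);
    return table[n]();
}
```

Calling `dispatch(2)` goes through the table and ends up in `handler<2>()`. Whether the generated code matches the two-instruction listing above still depends on your compiler and flags, so check the output yourself.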
In conclusion, never make your lookup tables local variables, even if they're constexpr. Clang actually optimizes such lookup tables well, but other compilers don't. See Compiler Explorer for a live example.
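A related option, if the table does not need to be scoped to one function, is to declare it at namespace scope. A constexpr variable at namespace scope always has static storage duration, so no `static` keyword is needed there. The sketch below assumes that approach; `handler`, `Handler`, `kTable` and `call` are hypothetical names, and `std::array` is used only for convenience.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical handlers that return a value derived from their index.
template <int N>
int handler() { return N * 10; }

using Handler = int (*)();

// Namespace-scope constexpr object: static storage duration by definition,
// so it is never rebuilt per call.
constexpr std::array<Handler, 4> kTable = {
    &handler<0>,
    &handler<1>,
    &handler<2>,
    &handler<3>
};

int call(std::size_t n)
{
    // Precondition added for this sketch.
    assert(n < kTable.size());
    return kTable[n]();
}
```

The trade-off is visibility: the table is now reachable from the whole translation unit instead of being hidden inside the one function that uses it.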