Here are some things that may help you improve your code.
Prefer iostreams to printf
The printf function has to parse its format string at runtime and then apply the result, whereas std::cout << n; selects the right overload at compile time because the type of n is already known. As a result, the C++ version can be faster. It's a common misconception that C++ iostream operations are always slower than printf (measuring is the way to go!), but here the output happens only once, so the difference probably doesn't matter.
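To make the difference concrete, here is a minimal sketch that formats the same number both ways. The helper names format_with_stream and format_with_printf are made up for this illustration and are not part of the code under review.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: operator<< is resolved at compile time
// from the static type of n, so no format string is needed.
std::string format_with_stream(unsigned long long n) {
    std::ostringstream out;
    out << n << '\n';
    return out.str();
}

// Hypothetical helper: the "%llu" specifier is parsed at run time
// and has to be kept in sync with the argument type by hand.
std::string format_with_printf(unsigned long long n) {
    char buf[32];  // large enough for 20 digits, a newline and the terminator
    std::snprintf(buf, sizeof buf, "%llu\n", n);
    return buf;
}
```

Both produce the same text; if the type of n ever changes, only the printf version needs its format string edited to stay correct.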
Prefer constexpr to const
In modern C++, a constexpr variable must be computable at compile time, so anything that can be made constexpr carries essentially zero runtime overhead. That's a big win for performance, so it should be something you strive for.
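As a small illustration, the sketch below uses a constexpr variable and a constexpr function, with a static_assert to show that the compiler really does the work. The names grid_size and square are assumptions chosen for this example only.

```cpp
// Hypothetical constant: must be initialized by a compile-time expression.
constexpr unsigned grid_size = 20;

// Hypothetical function: usable at compile time when its argument is constant.
constexpr unsigned long long square(unsigned long long x) {
    return x * x;
}

// Checked by the compiler; nothing is computed at run time for this line.
static_assert(square(grid_size) == 400, "square(20) is evaluated at compile time");
```

The same square function can still be called with runtime values, so nothing is lost by marking it constexpr.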
Understand integer promotion rules
The effect of declaring short everywhere it's used in this program is ... absolutely nothing! The reason is that integral promotion converts each short operand to an int before any arithmetic is done on it.
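The promotion can be observed directly by asking the compiler for the type of a short + short expression. This is a sketch that assumes C++14 or later (for the deduced return type) and a platform where int is wider than short; add_shorts is a hypothetical helper.

```cpp
#include <type_traits>

// Hypothetical helper: both operands are promoted to int before the addition,
// so the deduced return type is int, not short.
auto add_shorts(short a, short b) {
    return a + b;
}

static_assert(std::is_same<decltype(add_shorts(short{1}, short{2})), int>::value,
              "short + short yields int");
```

On such a platform, add_shorts(30000, 30000) returns 60000, a value that would not fit in a 16-bit short.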
Use objects
Since you're writing in C++, let's use objects! In this case, we might wrap everything up into a Grid object with a const member function that returns the answer. In my rewrite (built on a 64-bit Linux machine), main looks like this:
    int main() {
        Grid<22> g;
        std::cout << g.answer(20) << '\n';
    }
Reduce memory use by choosing a better data structure
If we think about the algorithm carefully, it's possible to do everything in a single array of length n rather than using an n x n grid. Here's one way to do it in the constructor of the Grid class:
    Grid() {
        for (unsigned i = 0; i < n; ++i) {
            p[i] = i + 2;
        }
        for (unsigned i = 1; i < n; ++i) {
            p[i] *= 2;
            for (unsigned j = i+1; j < n; ++j) {
                p[j] += p[j-1];
            }
        }
    }
This is essentially the same algorithm, but reduced to a single in-place array. More on the array itself in the next suggestion.
Prefer std::array to plain C-style arrays
The std::array is just as efficient as a plain C-style array, but has the considerable advantage that it can be used with all of the standard library algorithms. In this case, it seems prudent to make a templated class, so the first few lines of the Grid class are these:
    template <unsigned n>
    class Grid {
        std::array<unsigned long long,n> p;
        // rest of the class
    };
Carefully consider usage
Note that as the array is constructed, each successive entry in the final version of the array is the answer for that size grid. So we can use that by writing the answer function like so:
    unsigned long long answer(int i) const {
        return p.at(i-1);
    }
The at function will throw an exception if the index is out of bounds, so that's one less case we need to write code to handle (except perhaps at the caller's end).
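Putting the pieces from the last few suggestions together, the whole class might look roughly like the sketch below. It only assembles the fragments already shown; the comments trace the array contents for n = 4 as a sanity check.

```cpp
#include <array>

template <unsigned n>
class Grid {
    std::array<unsigned long long, n> p;

public:
    Grid() {
        // Paths through a 1 x k grid: k + 1. For n = 4: [2, 3, 4, 5]
        for (unsigned i = 0; i < n; ++i) {
            p[i] = i + 2;
        }
        // Each pass finalizes p[i] and updates the entries to its right.
        // For n = 4: [2, 6, 10, 15] -> [2, 6, 20, 35] -> [2, 6, 20, 70]
        for (unsigned i = 1; i < n; ++i) {
            p[i] *= 2;
            for (unsigned j = i + 1; j < n; ++j) {
                p[j] += p[j - 1];
            }
        }
    }

    // Number of paths through an i x i grid; at() throws std::out_of_range
    // if i is not in the range 1..n.
    unsigned long long answer(int i) const {
        return p.at(i - 1);
    }
};
```

With this definition, Grid<22>{}.answer(20) gives 137846528820, and passing 0 or 23 throws instead of reading outside the array.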
Further optimizations
With this same basic algorithm, we can move almost all of the calculation to compile-time. The way we can do that is to use constexpr and recursive templates to calculate values for all sizes at compile time and then the entire program reduces, effectively, to a runtime lookup in a static table. I'll leave this final optimization step to you, but this compile-time sieve of Eratosthenes might inspire you and give you ideas on how to do that.
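As one possible starting point, here is a sketch that swaps the recursive templates mentioned above for a plain constexpr function with loops, which requires C++17. The names make_table and table are assumptions for this illustration, not part of the original rewrite.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: runs the same single-array algorithm, but inside a
// constexpr function so the compiler can evaluate it (C++17 or later).
template <std::size_t n>
constexpr std::array<unsigned long long, n> make_table() {
    std::array<unsigned long long, n> p{};
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = i + 2;
    }
    for (std::size_t i = 1; i < n; ++i) {
        p[i] *= 2;
        for (std::size_t j = i + 1; j < n; ++j) {
            p[j] += p[j - 1];
        }
    }
    return p;
}

// The table is built by the compiler; at run time only a lookup remains.
constexpr auto table = make_table<22>();
static_assert(table[19] == 137846528820ULL, "answer for the 20 x 20 grid");
```

A recursive-template version would produce the same table on older standards, at the cost of more elaborate code.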