Not an answer, but some extended comments.
Sparse matrix-sparse matrix multiplication
MathematicaLover made the very good observation that--as written by OP--Mathematica performs sparse matrix-sparse matrix multiplies. In fact, Armadillo does it, too, it seems to be about twice as fast in that. I'd like to explain why I don't think thatwas quite surprised by the great margin by which Armadillo is better in this respectwas so much faster. Now I thinkbelieve that theI have an explanation for that: The compiler is probably clever enough to optimize in the following spirit:
First@AbsoluteTiming[ H10slow = H.H.H.H.H.H.H.H.H.H; Total[H10slow.f1]; ] First@AbsoluteTiming[ H2 = H.H; H4 = H2.H2; H10 = H2.H4.H4; Total[H10.f1]; ] H10 == H10slow 1.27093
0.440999
True
Compared to this, the Armadillo function takes 0.631295 on my machine ( which is an Intel Haswell, so an architecture for which the MKL is highly optimized).
SoHence Armadillo does not seem to be really better than Mathematica+MKL in this benchmark; it is the C++ compiler that made the difference (because it is able to do what I did by hand).
Sparse matrix-vector multiplication
So, let's also compare Armadillo's sparse matrix-vector multiplication to Mathematica's (or rather that of MKL). To that end I compiled the function ArmilloDot with
F3 = H*H*H*H*H*H*H*H*H*H*f2; replaced by
F3 = H*(H*(H*(H*(H*(H*(H*(H*(H*(H*f2))))))))); Then I ran
RepeatedTiming[Total[arma@"ArmadilloDot"[H, f1]]] RepeatedTiming[Total[Nest[H.# &, f1, 10]]] RepeatedTiming[Total[MatrixPower[H, 10, f1]]] and got these as results:
{0.0148027, -9.3592*10^6}
{0.00386909, -9.3592*10^6}
{0.00334479, -9.3592*10^6}
So the Armadillo variant is actually awfully slow. But to be fair, the difference probably comes from CSR to CSC conversion/transposition of the sparse matrix. Transposing a sparse matrix is actually less quick than one would expect it to be.!
Disclaimer
Once again, all timing results are strongly hardware and software dependent. I only tried to point out that performance tests have to be made thoroughly. And that MKL is a really good math library --- if you use an Intel CPU. It is however notorious for nerfing AMD CPUs.