
I would like to use a polynomial filtering scheme for finding the eigenvalues around a target angle of a given unitary matrix $U$. The scheme can be found in this paper. To reiterate, we apply a polynomial transformation to the given matrix $U$ via $g(U)= \sum_{m=0}^{k} e^{-i m \phi} U^{m}$, where $\phi$ is the target angle around which I want to find the eigenvalues and $k$ is a number chosen to give good convergence. Conventionally the same task is done with a shift-invert type method. Shift-invert is easy to use in Mathematica as well, and I have no problem with it. But since it is desirable to go to larger system sizes, and the memory requirements of shift-invert blow up quickly, polynomial filtering is an efficient alternative.
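To see why this filter favors eigenphases near $\phi$: on an eigenvector of $U$ with eigenphase $\theta$, $g(U)$ acts as the scalar $g(e^{i\theta}) = \sum_{m=0}^{k} e^{i m (\theta-\phi)}$, a Dirichlet-type kernel whose magnitude peaks sharply at $\theta = \phi$, so the wanted eigenvectors become the dominant ones. A quick look at the filter profile (just an illustration, with an arbitrary $k$):

With[{\[Phi] = Pi/2, k = 40},
 Plot[Abs[Sum[Exp[I m (\[Theta] - \[Phi])], {m, 0, k}]], {\[Theta], -Pi, Pi},
  PlotRange -> All, AxesLabel -> {"\[Theta]", "|g|"}]]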

I was trying to implement the same in Mathematica, and this is where the problem comes in. I did the naive brute-force thing: I evaluated the polynomial explicitly and then used Mathematica's Arnoldi method on the result; this time no shift is needed, since the filtering already takes care of that. But there is an obvious problem with this approach: it is not "matrix-free". The ARPACK library itself handles that part for me, since the user only has to supply the matrix-vector product. But I am clueless as to whether there is even a way to use ARPACK as an external package in Mathematica and do the whole computation in a matrix-free manner. I looked further into the matter, and at least to my knowledge, Mathematica does not offer matrix-free eigenvalue computations at the moment. Is it still possible to use ARPACK (which Mathematica presumably uses under the hood anyway) from within Mathematica, such that I hand it $g(U)$ as an operator and the rest is taken care of, and then I extract the final values? Any help or suggestion in this regard would be really helpful. Thanks.

Edit 1:

\[CapitalNu] = 11; nev = 50;
U = RandomVariate[CircularUnitaryMatrixDistribution[2^\[CapitalNu]]];
phitg = Pi/2; (*target phase*)
k = 0.8 2^(\[CapitalNu] + 1)/Floor[2^(\[CapitalNu]/2 + 1)];

So I take a simple unitary matrix from the CUE ensemble. I understand it is not an optimal choice (one would need a sparse matrix to really see the efficiency, but for the time being it does the job). Then I define a function to calculate the matrix polynomial as stated before:

gk[k_, \[Phi]_, x_] := Sum[Exp[-I m \[Phi]] MatrixPower[x, m], {m, 0, k}]
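As an aside, this recomputes every MatrixPower from scratch; a Horner-style Fold that reuses the running product needs only one matrix multiplication per term and gives the same matrix (a sketch):

(* same result as gk, but with one matrix product per term *)
gkHorner[k_, \[Phi]_, x_] := With[{id = IdentityMatrix[Length[x]]},
  Fold[id + Exp[-I \[Phi]] x . # &, id, Range[k]]]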

Now I compute the polynomially filtered $U$:

Upoly = gk[k, phitg, U];

Now, if you calculate the eigenvalues of maximum absolute value of Upoly, they should correspond to the eigenvalues of $U$ around $\phi = \pi/2$. For $U$ you can simply do

Eigenvalues[U, 10, Method -> {"Arnoldi", "Shift" -> Exp[I \[Pi]/2]}] 

while for Upoly note that the eigenvalues are transformed (the eigenvectors remain the same). The original eigenvalues are recovered from the Rayleigh quotients $\psi_{i}^{\dagger} U \psi_{i}$. You can do that in Mathematica as follows:

num = 10;
{eval, evec} = Eigensystem[Upoly, num, Method -> {"Arnoldi", "Criteria" -> "Magnitude"}];
filtered = Sort[ParallelTable[Conjugate[evec[[i]]] . U . evec[[i]], {i, 1, num}]]
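A quick consistency check (a sketch; Sort puts both lists of complex numbers into the same canonical order, so matching entries line up):

shiftInverted = Sort[Eigenvalues[U, num, Method -> {"Arnoldi", "Shift" -> Exp[I \[Pi]/2]}]];
Max[Abs[filtered - shiftInverted]] (* should be small *)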

Both agree. But I noticed that this way of doing it is essentially useless, because I am storing the large dense matrix obtained after the polynomial transformation, which completely defeats the purpose the method was designed for. The right way is a matrix-free evaluation, directly using the ARPACK package. That is easy to do with Python, as far as I can see (the paper also does the same), but I would like to know/see/understand whether it is doable in Mathematica or not.

Edit 2: I have been thinking about it for a while. I already have a code (where I have to explicitly store the dense matrix); is there still a way to do better than what I already have? I verified today that with this code I can go as far as a $2^{14} \times 2^{14}$ matrix. At $2^{15} \times 2^{15}$ shift-invert is still possible, but the RAM usage went crazy (I was on a workstation with 250 GB of RAM; it reached 165 GB to calculate 100 eigenvalues), so I quit before it hung. As I understand it, the point of polynomial filtering is to avoid the shift-invert step altogether: just compute $g(U)$ and then find the eigenvalues of largest modulus (which is exactly what the transformation $g(U)$ achieves). Can anyone suggest an optimal way to calculate $g(U)$? For now I do it the brute-force way, as in the code above. One more piece of information that may be important: when I eventually use this, $U$ will have the form $U = e^{- i H_{2}} e^{-i H_{1}}$, where $H_{2}$ is a diagonal matrix and $H_{1}$ is a highly sparse matrix for some many-body system. Does this help in any way to calculate the polynomial faster?
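One idea along these lines (only a sketch, not benchmarked; diag2 for the diagonal of $H_2$ and h1 for the sparse $H_1$ are placeholder names): since a Krylov method only ever needs $g(U)\,v$ for a vector $v$, the sum can be evaluated in Horner form with one application of $U$ per term, and MatrixExp[m, v], which applies $e^{m}$ to a vector without forming the dense exponential, handles the sparse factor:

(* sketch: apply g(U) to a vector without ever forming U or g(U);
   diag2 (vector of diagonal entries of H2) and h1 (SparseArray H1) are placeholders *)
applyU[w_] := Exp[-I diag2] MatrixExp[-I h1, w]; (* U = Exp[-I H2].Exp[-I H1] *)

(* Horner form of g(U).v = Sum over m of Exp[-I m \[Phi]] U^m . v *)
applyG[k_, \[Phi]_, v_] := Fold[v + Exp[-I \[Phi]] applyU[#] &, v, Range[k]];

Such an applyG[k, phitg, #] & could then be handed to a matrix-free Arnoldi routine in place of the stored Upoly.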

  • Please edit the question and include the Mathematica code that you've used. (Commented Feb 25 at 12:57)
  • Thanks, added. I hope this helps, although it is really bad and utterly useless for any practical purpose, as the built-in shift-invert already does a good job. But if one could do a matrix-free evaluation, it would be faster, or so is the hope (from what I read in the paper). (Commented Feb 25 at 13:13)
  • Did you see this? (Commented Feb 25 at 14:08)
  • Yes, but it is of no help. There seems to be no apparent way to do "matrix-free" manipulations in Mathematica. So I tried to go one step further and use ARPACK directly in Mathematica, if that is possible. (Commented Feb 25 at 14:28)
  • It looks like you are using ordinary Power in the definition of gk[] in a place where I believe you actually want MatrixPower. (Commented Feb 27 at 3:47)

2 Answers


The Arnoldi eigenvalue algorithm starts by creating what I call an Arnoldi factorization: we start with a random unit vector $v$ (or an initial guess for the eigenvector we would like to compute) and compute the vectors $$ v,\; A \, v,\; A^2 \, v,\; A^3 \, v,\dotsc,A^{m-1} \, v $$ for some not too large $m \leq n$. The span of these vectors is called the Krylov space of order $m$.
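Purely for illustration, the raw Krylov vectors could be generated like this (one never uses them directly in practice, because the iterates align more and more with the dominant eigenvector and the basis becomes numerically degenerate):

(* the m Krylov vectors v, A.v, ..., A^(m-1).v for a small random example *)
Module[{n = 100, m = 8, A, v},
 A = RandomReal[{-1, 1}, {n, n}];
 v = Normalize[RandomReal[{-1, 1}, n]];
 NestList[A . # &, v, m - 1]
 ]

In exact arithmetic these span the Krylov space.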

Then we orthonormalize these vectors with the Gram-Schmidt method to get an orthonormal system $$ v_1 = v, \, v_2, \, v_3, \dotsc, v_m. $$ As we go, we can compute an $m \times m$ matrix $H$ such that $$ A \, v_k = \sum_{l = 1}^m v_l \, H_{lk}, \quad \text{for} \quad k = 1, \dotsc, m-1. $$ The matrix $H$ happens to be a Hessenberg matrix, which is good for many things. For us the most important point is that the eigensystem of $H$ can be computed cheaply when $m$ is not too large.

The last vector's image $A \, v_m$ generally does not lie in the Krylov space of order $m$. Its orthogonal projection $r$ onto the orthogonal complement of the Krylov space is called the residual vector. It is a rough measure of the quality of the Krylov space: if $r = 0$, then $A$ maps the Krylov space to itself.

The following is an implementation of this algorithm. It computes $V = (v_1,\dotsc,v_m)^T$, $H$, and $r$ for a given maximal basis size "MaxBasisSize", and it may stop early when it finds that the residual $r$ is small enough.

ToPack = Developer`ToPackedArray;
PackedQ = Developer`PackedArrayQ;
PackedArrayType[A_] := None;
PackedArrayType[A_?Developer`PackedArrayQ] :=
  StringSplit[ToString[Developer`PackedArrayForm[A]], {"[", ","}][[2]];

ClearAll[Arnoldi];
Options[Arnoldi] = {
   "MaxBasisSize" -> 200,
   "Tolerance" -> 1. 10^-10,
   "OrthogonalizationTolerance" -> 1. 10^-12,
   "MaxOrthogonalizationIterations" -> 5
   };

Arnoldi[A_Function, r0_?(VectorQ[#] && PackedQ[#] &), OptionsPattern[]] :=
  Module[{H, V, k, r, v, w, h, \[Beta], sTOL, TOL, m, n, s, siter, smaxiter, snorm, innerprod},
   TOL = OptionValue["Tolerance"];
   n = Length[r0];
   m = Min[n, OptionValue["MaxBasisSize"]];
   sTOL = OptionValue["OrthogonalizationTolerance"];
   smaxiter = OptionValue["MaxOrthogonalizationIterations"];
   innerprod = {v, u} |-> Conjugate[v . Conjugate[u]];
   v = N[r0/Sqrt[innerprod[r0, r0]]];
   w = A[v];
   (* complex or real storage, depending on the start vector and on A *)
   If[TrueQ[PackedArrayType[v] === "Complex"] || TrueQ[PackedArrayType[w] === "Complex"],
    H = ConstantArray[0. + 0. I, {m, m}];
    V = ConstantArray[0. + 0. I, {m, n}];
    ,
    H = ConstantArray[0., {m, m}];
    V = ConstantArray[0., {m, n}];
    ];
   V[[1]] = v;
   k = 1;
   h = innerprod[V[[1]], w];
   H[[1, 1]] = h;
   r = w - h V[[1]];
   \[Beta] = Sqrt[innerprod[r, r]];
   While[\[Beta] > TOL && k < m,
    ++k;
    H[[k, k - 1]] = \[Beta];
    v = r/\[Beta];
    V[[k]] = v;
    w = A[v];
    With[{Vk = V[[1 ;; k]]},
     h = innerprod[Vk, w];
     r = w - h . Vk;
     (* iterative reorthogonalization against the basis built so far *)
     s = innerprod[Vk, r];
     snorm = Sqrt[innerprod[s, s]];
     siter = 0;
     While[(snorm > sTOL) && (siter < smaxiter),
      siter++;
      r -= s . Vk;
      h += s;
      s = innerprod[Vk, r];
      snorm = Sqrt[innerprod[s, s]];
      ];
     H[[1 ;; k, k]] = h;
     ];
    \[Beta] = Sqrt[innerprod[r, r]];
    ];
   If[k < m,
    {V[[1 ;; k]], H[[1 ;; k, 1 ;; k]], r},
    {V, H, r}
    ]
   ]

Here is a usage example:

n = 2048;
A = RandomComplex[{-1 - I, 1 + I}, {n, n}];
DotA = v |-> Dot[A, v];
KrylovDimension = 256;
v = RandomComplex[{-1 - I, 1 + I}, n];
{V, H, r} = Arnoldi[DotA, v, "MaxBasisSize" -> KrylovDimension];
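A quick sanity check of the factorization (assuming no early termination, so that V has the full KrylovDimension rows): by construction, $A \, v_k = \sum_l v_l \, H_{lk}$ holds exactly for $k < m$, and the last column is off by exactly the residual $r$. Both quantities should be at machine-precision level:

Max[Abs[(A . Transpose[V] - Transpose[V] . H)[[All, ;; -2]]]] (* columns 1..m-1 agree *)
Norm[A . V[[-1]] - Transpose[V] . H[[All, -1]] - r]           (* column m is off by r *)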

Then the idea is that the eigenvectors for the largest eigenvalues (largest by absolute value) are likely very close to the Krylov space. Since $H$ encodes what happens within the Krylov space, the eigenvalues of $H$ should serve as decent approximations, at least for the largest ones.

{\[Lambda]Arnoldi, eH} = Eigensystem[H];
eArnoldi = Normalize /@ (eH . V);
nEigenpairs = 64;
\[Lambda]Arnoldi = \[Lambda]Arnoldi[[;; nEigenpairs]];
eArnoldi = eArnoldi[[;; nEigenpairs]];

Let's have a look at the residuals of the eigenvalue problem:

ListLinePlot[
 MapThread[Norm[DotA[#2] - #1 #2]/Norm[#1 #2] &, {\[Lambda]Arnoldi, eArnoldi}],
 ScalingFunctions -> "Log",
 PlotLabel -> "Relative residuals of eigenpairs"
 ]

(Plot: relative residuals of eigenpairs)

Not perfect, but at least the biggest eigenvalues (the ones on the left) are captured quite well.

There are certainly more things to be done. E.g., we can restart the Arnoldi iterations to get Krylov spaces better adapted to specific eigenpairs, and we can apply shifts and approximate matrix inverses to get the lower eigenvalues approximated well, too (sketched after the next plot). And indeed, even for the largest eigenvalues ARPACK must do something on top, because it takes considerably longer and produces considerably better results:

{\[Lambda]MMA, eMMA} = Eigensystem[A, nEigenpairs,
   Method -> {"Arnoldi", "BasisSize" -> KrylovDimension}];
ListLinePlot[
 MapThread[Norm[DotA[#2] - #1 #2] &, {\[Lambda]MMA, eMMA}],
 ScalingFunctions -> "Log",
 PlotLabel -> "Absolute residuals of eigenpairs"
 ]

(Plot: absolute residuals of eigenpairs)
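Here is what the shift-and-invert idea mentioned above could look like through the same matrix-free interface (just a sketch; the shift \[Sigma] is an arbitrary target, and for a dense A the one-time factorization dominates the cost):

\[Sigma] = 0.5 + 0.5 I;
shiftInv = LinearSolve[A - \[Sigma] IdentityMatrix[n]]; (* factorize once, reuse each iteration *)
{Vs, Hs, rs} = Arnoldi[v |-> shiftInv[v], RandomComplex[{-1 - I, 1 + I}, n],
   "MaxBasisSize" -> KrylovDimension];
(* eigenvalues of (A - \[Sigma] I)^-1 are largest for eigenvalues of A nearest \[Sigma];
   map the Ritz values \[Mu] back via \[Lambda] = \[Sigma] + 1/\[Mu] *)
\[Lambda]Near = \[Sigma] + 1/Eigenvalues[Hs][[;; 10]];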

Mimicking ARPACK's complicated strategy here would be a bit too hard. (I would have to do some research first.) But we can try to refine the $k$-th eigenpair by simply restarting Arnoldi on it:

k = 10;
{Vk, Hk, rk} = Arnoldi[DotA, eArnoldi[[k]], "MaxBasisSize" -> KrylovDimension];
{\[Lambda]k, ek} = Eigensystem[Hk];
ek = Normalize /@ (ek . Vk);
\[Lambda]k = \[Lambda]k[[1 ;; nEigenpairs]];
ek = ek[[1 ;; nEigenpairs]];
ListLinePlot[
 MapThread[Norm[DotA[#2] - #1 #2]/Norm[#1 #2] &, {\[Lambda]k, ek}],
 ScalingFunctions -> "Log",
 PlotLabel -> "Relative residuals of eigenpairs"
 ]

(Plot: relative residuals of eigenpairs after restarting)

  • Thank you so much for your time and this great answer. It is exactly what I needed. (Commented Mar 3 at 22:21)
  • Glad to be of help. The lack of a matrix-free Arnoldi API has come up several times on this site, and I was curious myself how difficult this would be. Fortunately I already had some code for the Arnoldi function; I had written it as a prototype for the GMRES solver in my package Tensors. (Commented Mar 3 at 22:30)
  • I just started checking it. It seems the Arnoldi function should instead be ArnoldiBasis? Also, eArnoldi[[-1]] seems to be undefined; I believe it can be some random normalized vector? (Commented Mar 4 at 0:54)
  • Uh, yeah, I renamed that function a couple of times. I corrected that and changed ArnoldiBasis to Arnoldi. And yes, eArnoldi[[-1]] should have been some random vector (or some initial guess for an eigenvector). I corrected that as well. (Commented Mar 4 at 2:25)
  • I would like to ask for permission, if that is okay with you. I have done some modifications and detailed testing of your code, and now I can get a few hundred eigenvalues within machine precision for a system size of 2^18 (the trick is to choose a proper Krylov dimension; I used some basic restarting techniques). After benchmarking it, I am going to use it for further calculations for my paper. Is it okay if I cite your "Tensors" package, or is there some other way to cite this? (Commented Mar 7 at 10:54)

Not an answer, but there is some matrix-free linear algebra:

n = 19;
s = SparseArray[{{i_, i_} -> -2., {i_, j_} /; Abs[i - j] == 1 -> 1.}, {n, n}];
b = Table[0., {n}]; b[[9]] = 1.;
xk = LinearSolve[s, b, Method -> "Krylov"];

In the following I use s for simplicity, but funA can be any function that returns the matrix-vector product:

(* matrix-free *)
funA[x_] := s . x;
xk2 = LinearSolve[funA, b, Method -> "Krylov"];

Close enough for government work:

Norm[xk - xk2] (* 2.95134*10^-12 *) 
  • Thanks. Do you think it is possible to do something similar when trying to find eigenvalues in Mathematica? I am told that with Python one can call the ARPACK package such that the evaluations are matrix-free. I assume it would be something like the above, but I cannot tell whether it will work in Mathematica or not. (Commented Feb 25 at 15:31)
