I have two 3-dimensional arrays, A and B, where
- A has dimensions (500 x 500 x 80), and
- B has dimensions (500 x 80 x 2000).
In both arrays, the dimension of size 80 can be called 'time' (80 timepoints, indexed by i). The dimension of size 2000 can be called 'scenario' (we have 2000 scenarios).
What I need to do is take the 500 x 500 matrix A[:, :, i] and multiply it by each 500-element column vector B[:, i, scenario], for every scenario and every time i.
I eventually ended up with the code below:

```python
from scipy.stats import norm
import numpy as np

A = norm.rvs(size=(500, 500, 80), random_state=0)
B = norm.rvs(size=(500, 80, 2000), random_state=0)
result = np.einsum('ijk,jkl->ikl', A, B, optimize=True)
```
A naive approach to the same problem would be a nested for loop:

```python
out = np.empty((500, 80, 2000))
for scenario in range(2000):
    for i in range(80):
        out[:, i, scenario] = A[:, :, i] @ B[:, i, scenario]
```

I expected einsum to be quite fast, because the problem 'only' involves simple operations on a large array, but it actually runs for several minutes.
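As a sanity check that the einsum really computes the same thing as the loop, here is the same comparison at much smaller sizes (using NumPy's random generator instead of scipy for brevity; the small dimensions are arbitrary stand-ins for 500, 80 and 2000):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, S = 4, 3, 5  # small stand-ins for 500, 80, 2000

A = rng.standard_normal((n, n, T))
B = rng.standard_normal((n, T, S))

# Explicit nested loop over scenarios and timepoints.
out = np.empty((n, T, S))
for scenario in range(S):
    for i in range(T):
        out[:, i, scenario] = A[:, :, i] @ B[:, i, scenario]

# Same contraction via einsum: sum over j in A[i, j, k] * B[j, k, l].
result = np.einsum('ijk,jkl->ikl', A, B, optimize=True)
assert np.allclose(out, result)
```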
I compared the speed of the einsum above to a simplified case where every matrix in A is assumed to be the same, so A can be kept as a single (500 x 500) matrix instead of a 3-d array, and the whole problem can be written as:

```python
A = norm.rvs(size=(500, 500), random_state=0)
B = norm.rvs(size=(500, 80, 2000), random_state=0)
result = np.einsum('ij,jkl->ikl', A, B, optimize=True)
```

This is fast and runs in only a few seconds, much faster than the 'slightly' more general case above.
My question is: is the general case with the slow einsum already written in a computationally efficient form, or is there a faster way to express it?
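For comparison, one reformulation I assume should be equivalent is a batched `np.matmul` with 'time' moved to the leading (batch) axis. A small-size sketch (I have not benchmarked it at the full 500 x 80 x 2000 scale):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, S = 4, 3, 5  # small stand-ins for 500, 80, 2000

A = rng.standard_normal((n, n, T))
B = rng.standard_normal((n, T, S))

# Move 'time' to the leading axis so matmul batches over it:
# (T, n, n) @ (T, n, S) -> (T, n, S), then restore the axis order.
C = np.matmul(np.moveaxis(A, 2, 0), np.moveaxis(B, 1, 0))
result = np.moveaxis(C, 0, 1)  # shape (n, T, S)

# Agrees with the explicit per-timepoint matrix products.
out = np.empty((n, T, S))
for i in range(T):
    out[:, i, :] = A[:, :, i] @ B[:, i, :]
assert np.allclose(result, out)
```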