I want to calculate the squared euclidean distance between two sets of points, inputs and testing. inputs is typically a real array of size ~(200, N), whereas testing is typically ~(1e8, N), and N is around 10. The distances should be scaled in each dimension in N, so I'd be aggregating the expression scale[j]*(inputs[i,j] - testing[ii,j])**2 (where scale is the scaling vector) over N times. I am trying to make this as fast as possible, particularly as N can be large. My first test is
def old_version (inputs, testing, x0): nn, d1 = testing.shape n, d1 = inputs.shape b = np.zeros((n, nn)) for d in xrange(d1): b += x0[d] * (((np.tile(inputs[:, d], (nn, 1)) - np.tile (testing[:, d], (n, 1)).T))**2).T return b Nothing too fancy. I then tried using scipy.spatial.distance.cdist, although I still have to loop through it to get the scaling right
def new_version (inputs, testing, x0): # import scipy.spatial.distance as dist nn, d1 = testing.shape n, d1 = inputs.shape b = np.zeros ((n, nn)) for d in xrange(d1): b += x0[d] * dist.cdist(inputs[:, d][:, None], testing[:, d][:, None], 'sqeuclidean') return b It would appear that new_version scales better (as N > 1000), but I'm not sure that I've gone as fast as possible here. Any further ideas much appreciated!