My first question: what is labels at each step in the loop?
With dist_ = array([2,1,2]) and TLabels=array([1,2,3])
I get
    [-1 1]
    [1]
    [-1 1]
The different lengths immediately raise a warning flag: it may be difficult to vectorize this.
With the longer arrays in the edited example
    [-1 1 -1 -1 -1]
    [ 1 1 1 1 -1]
    [-1 1 -1 -1 -1]
    [ 1 1 1 1 -1]
    [ 1 1 1 1 -1]
    [-1 1 -1 -1 -1]
    [-1 1 -1 -1 -1]
    [ 1 1 1 1 -1]
    [ 1 1 1 1 -1]
    [-1 1 -1 -1 -1]
The labels vectors are all the same length. Is that normal, or just a coincidence of values?
Drop a couple of elements off of dist_, and labels are:
    In [375]: for i in range(len(dist_)):
        labels = TLabels[dist_ == dist_[i]]
        v = (1.*np.sum(labels)/t); v1 = 1-TLabels[i]*v
        print(labels, v, TLabels[i], v1)
        cLoss += v1
       .....:
    (array([-1, 1, -1, -1]), -0.25, -1, 0.75)
    (array([1, 1, 1, 1]), 0.5, 1, 0.5)
    (array([-1, 1, -1, -1]), -0.25, 1, 1.25)
    (array([1, 1, 1, 1]), 0.5, 1, 0.5)
    (array([1, 1, 1, 1]), 0.5, 1, 0.5)
    (array([-1, 1, -1, -1]), -0.25, -1, 0.75)
    (array([-1, 1, -1, -1]), -0.25, -1, 0.75)
    (array([1, 1, 1, 1]), 0.5, 1, 0.5)
Again different lengths of labels, but really only a few calculations. There is one v value for each distinct dist_ value.
Without working out all the details, it looks like you are just calculating labels*labels for each distinct dist_ value, and then summing those.
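As a rough sketch of that observation, the loop can be compared against a grouped version built on np.unique and np.bincount. TLabels and t = 8 are read off the printed output above. The dist_ values are placeholders (the real ones are not shown) chosen only to reproduce the same two groups, and the function names are made up for illustration.

```python
import numpy as np

def closs_loop(dist_, TLabels, t):
    """Reference version: the per-element loop from the question."""
    cLoss = 0.0
    for i in range(len(dist_)):
        labels = TLabels[dist_ == dist_[i]]
        v = np.sum(labels) / t
        cLoss += 1 - TLabels[i] * v
    return cLoss

def closs_grouped(dist_, TLabels, t):
    """Grouped version: one label sum per distinct dist_ value."""
    # inverse[i] is the position of dist_[i] among the sorted unique values
    _, inverse = np.unique(dist_, return_inverse=True)
    # group_sums[g] is the sum of TLabels over every element of group g
    group_sums = np.bincount(inverse, weights=TLabels)
    # Hand each element its group's v, then sum the per-element losses
    v = group_sums[inverse] / t
    return np.sum(1 - TLabels * v)

def closs_closed_form(dist_, TLabels, t):
    """Same total, written as n - sum(group_sum**2) / t."""
    _, inverse = np.unique(dist_, return_inverse=True)
    group_sums = np.bincount(inverse, weights=TLabels)
    return len(dist_) - np.sum(group_sums ** 2) / t

# Placeholder dist_ values; only the grouping pattern matters here
dist_ = np.array([2, 1, 2, 1, 1, 2, 2, 1])
TLabels = np.array([-1, 1, 1, 1, 1, -1, -1, 1])
t = 8

loop_total = closs_loop(dist_, TLabels, t)
grouped_total = closs_grouped(dist_, TLabels, t)
closed_total = closs_closed_form(dist_, TLabels, t)
print(loop_total, grouped_total, closed_total)
```

Because v is the same for every element of a group, the total collapses to len(dist_) minus the sum of squared group sums divided by t, which is the "labels*labels per distinct dist_ value" pattern. Whether this matches the real data depends on the actual dist_ values and t.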
This looks like a groupby problem. You want to divide dist_ into groups with a common value, and sum some function of their corresponding TLabels values. Python's itertools has a groupby function, and so does pandas. itertools.groupby only groups consecutive equal items, so it requires you to sort dist_ first; pandas groupby does not need sorted input.
Try sorting dist_ and see if that adds any clarity to the problem.
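To illustrate why sorting matters for itertools.groupby, here is a small standard-library sketch. The TLabels values are taken from the printed output above, while the dist_ values are placeholders picked for illustration.

```python
from itertools import groupby

# Placeholder dist_ values; TLabels as in the printed output
dist_ = [2, 1, 2, 1, 1, 2, 2, 1]
TLabels = [-1, 1, 1, 1, 1, -1, -1, 1]

# Without sorting, groupby starts a new group at every change of value
unsorted_runs = [key for key, _ in groupby(dist_)]

# Sort the (dist, label) pairs by dist so equal values become adjacent
pairs = sorted(zip(dist_, TLabels), key=lambda p: p[0])

# One label sum per distinct dist_ value
group_sums = {
    d: sum(label for _, label in grp)
    for d, grp in groupby(pairs, key=lambda p: p[0])
}

print(unsorted_runs)
print(group_sums)
```

On the unsorted input the keys repeat, so each distinct value is split across several groups. After sorting there is exactly one group per distinct dist_ value.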
TLabels[dist_ == dist_[i]] will return the values from TLabels at the indices where dist_ == dist_[i]. For example, let dist_ = array([2,1,2]) and TLabels = array([1,2,3]); then dist_ == dist_[0] will return array([True,False,True]), and so TLabels[dist_ == dist_[0]] = array([1,3]).
(t,1) or (t,)?
Where is cLoss initialized?
cLoss is initially 0 or [].
And why the return? You aren't defining a function.