5

Is there a way to avoid this loop in order to optimize the code?

import numpy as np

cLoss = 0
dist_ = np.array([0,1,0,1,1,0,0,1,1,0]) # just an example, longer in reality
TLabels = np.array([-1,1,1,1,1,-1,-1,1,-1,-1]) # just an example, longer in reality
t = float(dist_.size)
for i in range(len(dist_)):
    labels = TLabels[dist_ == dist_[i]]
    cLoss += 1 - TLabels[i]*(1. * np.sum(labels)/t)
print cLoss

Note: dist_ and TLabels are both numpy arrays with the same shape (t,1)

6
  • 3
    What are you trying to accomplish? Commented Jun 7, 2015 at 10:53
  • Well, I believe it's correct: TLabels[dist_ == dist_[i]] returns the values of TLabels at the indices where dist_ == dist_[i]. For example, let dist_ = array([2,1,2]) and TLabels = array([1,2,3]); then dist_ == dist_[0] returns array([True,False,True]), so TLabels[dist_ == dist_[0]] = array([1,3]). Commented Jun 7, 2015 at 11:01
  • Just to be clear, are the arrays (t,1) or (t,)? Where is cLoss initialized? Commented Jun 7, 2015 at 18:00
  • 1
    You need to turn this into a full running (cut and paste) example, with output. Otherwise we won't take it seriously. Commented Jun 7, 2015 at 18:09
  • Is cLoss initially 0 or []? And why the return? You aren't defining a function. Commented Jun 7, 2015 at 19:41

3 Answers

2

I am not sure exactly what you want to do, but are you aware of scipy.ndimage.measurements for computing on arrays with labels? It looks like you want something like:

cLoss = len(dist_) - sum(TLabels * scipy.ndimage.measurements.sum(TLabels,dist_,dist_) / len(dist_)) 
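For comparison, here is a NumPy-only sketch of the same labeled-sum idea. It swaps the scipy call for np.bincount, which is not what the answer uses, and it assumes dist_ is a 1-D array of small non-negative integers (call .ravel() first if the arrays really are (t,1)). The names loop_closs and vectorized_closs are made up for illustration.

```python
import numpy as np

def loop_closs(dist_, TLabels):
    """Reference: the loop from the question."""
    t = float(dist_.size)
    total = 0.0
    for i in range(len(dist_)):
        labels = TLabels[dist_ == dist_[i]]
        total += 1 - TLabels[i] * (np.sum(labels) / t)
    return total

def vectorized_closs(dist_, TLabels):
    """Same quantity without the Python loop (hypothetical helper)."""
    t = float(dist_.size)
    # Sum of TLabels inside each group; dist_ values are used as bin indices,
    # so they must be non-negative integers.
    group_sums = np.bincount(dist_, weights=TLabels)
    # group_sums[dist_] hands every element the sum of its own group.
    return t - np.sum(TLabels * group_sums[dist_]) / t

dist_ = np.array([0, 1, 0, 1, 1, 0, 0, 1, 1, 0])
TLabels = np.array([-1, 1, 1, 1, 1, -1, -1, 1, -1, -1])
print(loop_closs(dist_, TLabels), vectorized_closs(dist_, TLabels))
```

Both functions should agree on the example arrays from the question, which makes the loop a handy check before trusting the vectorized form on longer data.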

2

I first wonder, what is labels at each step in the loop?

With dist_ = array([2,1,2]) and TLabels=array([1,2,3])

I get

[-1 1] [1] [-1 1] 

The different lengths immediately raise a warning flag: it may be difficult to vectorize this.

With the longer arrays in the edited example

[-1 1 -1 -1 -1]
[ 1 1 1 1 -1]
[-1 1 -1 -1 -1]
[ 1 1 1 1 -1]
[ 1 1 1 1 -1]
[-1 1 -1 -1 -1]
[-1 1 -1 -1 -1]
[ 1 1 1 1 -1]
[ 1 1 1 1 -1]
[-1 1 -1 -1 -1]

The labels vectors are all the same length. Is that normal, or just a coincidence of values?

Drop a couple of elements off of dist_, and labels are:

In [375]: for i in range(len(dist_)):
    labels = TLabels[dist_ == dist_[i]]
    v = (1.*np.sum(labels)/t); v1 = 1-TLabels[i]*v
    print(labels, v, TLabels[i], v1)
    cLoss += v1
   .....:
(array([-1, 1, -1, -1]), -0.25, -1, 0.75)
(array([1, 1, 1, 1]), 0.5, 1, 0.5)
(array([-1, 1, -1, -1]), -0.25, 1, 1.25)
(array([1, 1, 1, 1]), 0.5, 1, 0.5)
(array([1, 1, 1, 1]), 0.5, 1, 0.5)
(array([-1, 1, -1, -1]), -0.25, -1, 0.75)
(array([-1, 1, -1, -1]), -0.25, -1, 0.75)
(array([1, 1, 1, 1]), 0.5, 1, 0.5)

Again different lengths of labels, but really only a few calculations. There is one v value for each distinct dist_ value.
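To make the "one v per distinct dist_ value" point concrete, here is a small sketch using np.unique with return_inverse. It assumes the trimmed run above used the first eight elements of the question's arrays, which is consistent with the printed rows but not stated. The name v_per_group is made up for illustration.

```python
import numpy as np

# Assumed: the first eight elements of the question's example arrays.
dist_ = np.array([0, 1, 0, 1, 1, 0, 0, 1])
TLabels = np.array([-1, 1, 1, 1, 1, -1, -1, 1])
t = float(dist_.size)

# values holds the distinct dist_ entries; inverse[i] is the position of
# dist_[i] within values, so it works as a group index for any sortable values.
values, inverse = np.unique(dist_, return_inverse=True)

# One v per distinct dist_ value.
v_per_group = np.bincount(inverse, weights=TLabels) / t

# Per-element term 1 - TLabels[i] * v, looked up through the group index.
v1 = 1 - TLabels * v_per_group[inverse]
print(values, v_per_group, v1.sum())
```

Unlike indexing with dist_ directly, the inverse index does not require the dist_ values to be small non-negative integers.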

Without working out all the details, it looks like you are just calculating labels*labels for each distinct dist_ value, and then summing those.

This looks like a group-by problem. You want to divide dist_ into groups with a common value, and sum some function of their corresponding TLabels values. Python's itertools has a groupby function, and so does pandas. itertools.groupby only merges consecutive equal keys, so it requires you to sort dist_ first; pandas' groupby does not.

Try sorting dist_ and see if that adds any clarity to the problem.
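A minimal sketch of the sort-then-group idea with itertools.groupby, using the example arrays from the question. The last step relies on the observation that summing TLabels[i] times its group's sum over all i is the same as summing the squared group sums; that rearrangement is mine, not something stated in the question.

```python
from itertools import groupby
import numpy as np

dist_ = np.array([0, 1, 0, 1, 1, 0, 0, 1, 1, 0])
TLabels = np.array([-1, 1, 1, 1, 1, -1, -1, 1, -1, -1])
t = float(dist_.size)

# groupby only merges adjacent equal keys, so order both arrays by dist_ first.
order = np.argsort(dist_, kind="stable")
pairs = zip(dist_[order].tolist(), TLabels[order].tolist())

# One label sum per distinct dist_ value.
group_sums = {key: sum(label for _, label in grp)
              for key, grp in groupby(pairs, key=lambda p: p[0])}

# Each element contributes 1 - TLabels[i] * group_sum / t, so the total
# collapses to t - sum(group_sum ** 2) / t.
cLoss = t - sum(s * s for s in group_sums.values()) / t
print(group_sums, cLoss)
```

After sorting, the work is proportional to the number of elements plus the number of distinct dist_ values, rather than one full mask per element as in the original loop.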


1

I'm not sure if this is any better, since I didn't exactly understand why you might want to do this. Many variables in your loop take only two values, so they can be computed in advance.

Also the entries of dist_ can be used as a boolean switch but I used an explicit copy anyhow.

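import numpy as np

dist_ = np.array([0,1,0,1,1,0,0,1,1,0])
TLabels = np.array([-1,1,1,1,1,-1,-1,1,-1,-1])
t = len(dist_)
dist_zeros = dist_ == 0  # boolean mask for the entries equal to 0
# [normalized label sum where dist_ == 0, normalized label sum where dist_ == 1]
one_zero_sum = [sum(TLabels[dist_zeros])/t , sum(TLabels[~dist_zeros])/t]
cLoss = sum([1-x*one_zero_sum[dist_[y]] for y,x in enumerate(TLabels)])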

This results in cLoss = 8.2. I am using Python 3, so I didn't check whether the division is a true division in Python 2.
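On the division question: in Python 2, dividing an integer sum by an integer length truncates, so one hedged way around it is to make t a float up front. The sketch below does that and also replaces the list comprehension with np.where. It assumes, as this answer does, that dist_ only ever contains 0 and 1; the names zero_sum and one_sum are made up for illustration.

```python
import numpy as np

dist_ = np.array([0, 1, 0, 1, 1, 0, 0, 1, 1, 0])
TLabels = np.array([-1, 1, 1, 1, 1, -1, -1, 1, -1, -1])

# A float denominator gives true division on both Python 2 and Python 3.
t = float(len(dist_))

dist_zeros = dist_ == 0
# The two possible group values, each computed once.
zero_sum = TLabels[dist_zeros].sum() / t
one_sum = TLabels[~dist_zeros].sum() / t

# Select the matching group value per element instead of looping.
cLoss = np.sum(1 - TLabels * np.where(dist_zeros, zero_sum, one_sum))
print(cLoss)
```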

