
I am stuck on how to use zip here. I am aware that zip only takes iterable objects (lists, sets, tuples, strings, iterators, etc.). I am trying to generate an output file that writes three float values, each in its own column. I would really appreciate some feedback on how else I can tackle this problem while getting the same outcome.
FYI, my input file looks something like this:

    1600 1
    1700 3
    1800 2.5
    3000 1
    7000 5

The following is my code so far.

    import numpy as np
    import os
    import csv

    myfiles = os.listdir('input')

    for file in myfiles:
        size = []
        norm_intensity = []
        with open('input/' + file, 'r') as f:
            data = csv.reader(f, delimiter=',')
            next(data)
            next(data)
            for row in data:
                size.append(float(row[0]))
                norm_intensity.append(float(row[1]))

        x_and_y = []
        row = np.array([list(i) for i in zip(size, norm_intensity)])
        for x, y in row:
            if y > 0:
                x_and_y.append((x, y))

        # Sum of intensity from the first pool
        first_x = []
        first_y = []
        for x, y in x_and_y:
            if x > 1600 and x < 2035.549:
                first_x.append(x)
                first_y.append(y)
        first_sum = np.sum(first_y)

Up to this point, I am collecting the y values where x is greater than 1600 but smaller than 2035.549. In a similar way, I get the second and third sums (each over a different x range).

The following is the most troubling part so far.

    first_pool = first_sum / (first_sum + second_sum + third_sum)
    second_pool = second_sum / (first_sum + second_sum + third_sum)
    third_pool = third_sum / (first_sum + second_sum + third_sum)

    with open('output_pool/' + file, 'w') as f:
        for a, b, c in zip(first_pool, second_pool, third_pool):
            f.write('{0:f},{1:f},{2:f}\n'.format(a, b, c))

What I want to have at the end is the following:

    first_pool    second_pool    third_pool
    (first_sum)   (second_sum)   (third_sum)

Since first_pool, second_pool, and third_pool are all floats, I am currently running into an error message saying "zip argument #1 must support iteration". Do you have any suggestions for how I could still achieve this goal?
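For reference, the error can be reproduced in isolation. This is only a minimal sketch: the three float values are made up and stand in for the pools computed above, and the exact wording of the error message depends on the Python version.

```python
# Made-up values standing in for first_pool, second_pool, third_pool
first_pool, second_pool, third_pool = 0.5, 0.3, 0.2

try:
    # zip() needs iterables; a bare float is not one, so this raises TypeError
    list(zip(first_pool, second_pool, third_pool))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping each float in a one-element list makes zip() work,
# yielding a single (a, b, c) tuple
rows = list(zip([first_pool], [second_pool], [third_pool]))

print(error_message)
print(rows)
```

Wrapping each value in a list is shown only to illustrate what zip expects; with just three scalars, formatting them directly is simpler.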

  • Just a suggestion, you can also use Numpy to do your filtering for you: table = np.array(list(zip(size, norm_intensity))); np.sum(table[:,1][(1600 < table[:,0]) & (table[:,0] < 2035.549) & (table[:,1] > 0)]) Commented Sep 27, 2017 at 22:18
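The one-liner in the comment above can be unpacked as follows. This is a rough sketch that uses the sample values from the question as in-memory lists; reading the real file and its delimiter are left out.

```python
import numpy as np

# Sample (size, intensity) pairs taken from the input shown in the question
size = [1600, 1700, 1800, 3000, 7000]
norm_intensity = [1, 3, 2.5, 1, 5]

table = np.array(list(zip(size, norm_intensity)), dtype=float)
x, y = table[:, 0], table[:, 1]

# Boolean mask: x strictly inside the first range, positive intensity only
mask = (1600 < x) & (x < 2035.549) & (y > 0)
first_sum = y[mask].sum()

print(first_sum)
```

The second and third sums would use the same pattern with their own x ranges, which the question does not state.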

1 Answer

From what I can tell, you don't need zip. Something like the following should do what you want:

    sums = [first_sum, second_sum, third_sum]
    pools = [first_pool, second_pool, third_pool]
    ...
    for a, b, c in [pools, sums]:
        f.write('{0:f},{1:f},{2:f}\n'.format(a, b, c))
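Here is a self-contained version of that idea that can be run as-is. It is only a sketch: the three sums are made-up numbers, and io.StringIO stands in for the real output file so nothing is written to disk.

```python
import io

# Made-up sums for illustration; in the question they come from the filtered data
first_sum, second_sum, third_sum = 10.0, 6.0, 4.0
total = first_sum + second_sum + third_sum

sums = [first_sum, second_sum, third_sum]
pools = [s / total for s in sums]

# io.StringIO stands in for open('output_pool/' + file, 'w')
f = io.StringIO()
for a, b, c in [pools, sums]:
    # Each inner list has three items, so it unpacks into a, b, c
    f.write('{0:f},{1:f},{2:f}\n'.format(a, b, c))

output = f.getvalue()
print(output)
```

The loop runs twice: once for the row of pools and once for the row of sums, which gives the two-row layout asked for in the question.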

Zipping would be, for example, if you had these two lists and wanted pairs of sums and pools:

    for pool, summation in zip(pools, sums):
        f.write('Pool: {}, Sum: {}\n'.format(pool, summation))
    # Pool: 0.5, Sum: 10
    # Pool: 0.3, Sum: 6
    # ...

1 Comment

Thanks! I guess I was overcomplicating things. I made a list of the "pools" and used it. It works great!
