I have a long list of values. I want to define a function that takes the list, calculates the average of every 24 values, and returns the averages as a list. How do I do this? The list has 8760 elements, so the returned list should have 8760/24 = 365 elements.

hourly_temp = ['-0.8', '-0.7', '-0.3', '-0.3', '-0.8', '-0.5', '-0.7', '-0.6', '-0.7', '-1.2', '-1.7...]  # This goes on, it's 8760 elements

def daily_mean_temp(hourly_temp):
    first_24_elements = hourly_temp[:24]  # First 24 elements in the list

Is this correct? I get an error saying: TypeError: cannot perform reduce with flexible type

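def daily_mean_temp(hourly_temp):
    # Group the readings into consecutive blocks of 24 and average each block.
    # The values are strings, so convert them to float before summing.
    averages = [sum(map(float, myrange)) / len(myrange)
                for myrange in zip(*[iter(hourly_temp)] * 24)]
    return averages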

5 Answers

Assuming that you want independent groups, you can use the grouper itertools recipe:

from itertools import izip_longest  # Python 2 name; zip_longest in Python 3

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

And then easily get the average of each group:

averages = [sum(group)/float(len(group)) for group in grouper(data, 24)] 

Edit: given that your data appears to be a list of strings, I would suggest you convert to floats first using map:

data = map(float, hourly_temp) 
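For readers on Python 3, where `izip_longest` is named `zip_longest` and `map` returns an iterator rather than a list, the steps above might look like the sketch below. The sample list is made up for illustration (48 strings rather than 8760), and `daily_means` is a hypothetical helper name, not something from the question.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def daily_means(hourly_temp, hours_per_day=24):
    """Hypothetical helper: mean of each block of hours_per_day readings."""
    data = [float(t) for t in hourly_temp]  # strings -> floats
    means = []
    for group in grouper(data, hours_per_day):
        values = [v for v in group if v is not None]  # ignore the None padding
        means.append(sum(values) / len(values))
    return means


# Made-up sample: two "days" of string readings, not the asker's data.
sample = ['1.0'] * 24 + ['3.0'] * 24
print(daily_means(sample))  # [1.0, 3.0]
```

Filtering out `None` only matters when the list length is not a multiple of 24; with 8760 values every group is complete.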

Assuming your values are strings, as you show above, and that you have NumPy handy, this should be fast:

import numpy as np

averages = [x.mean() for x in np.array_split(
    [float(x) for x in hourly_temp], 365)]

And if you might have NaNs:

averages = [x[~np.isnan(x)].mean() for x in np.array_split( [float(x) for x in hourly_temp], 365)] 

And if you start with proper floats:

averages = [x[~np.isnan(x)].mean() for x in np.array_split(hourly_temp, 365)] 
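When the length is an exact multiple of 24, as with 8760 hourly values, another option is to reshape the data into one row per day and average along each row. This is a sketch rather than part of the answer above: the sample data is invented, and `np.nanmean` is used on the assumption that missing readings are stored as NaN.

```python
import numpy as np

# Invented sample: 3 days of hourly readings as strings.
hourly_temp = ['1.0'] * 24 + ['2.0'] * 24 + ['4.0'] * 24

values = np.array([float(x) for x in hourly_temp])  # convert strings to floats
days = values.reshape(-1, 24)                       # one row per day; needs len % 24 == 0
averages = np.nanmean(days, axis=1).tolist()        # per-day mean, skipping NaNs
print(averages)  # [1.0, 2.0, 4.0]
```

Unlike `np.array_split`, `reshape` raises an error if the length is not divisible by 24, which may or may not be what you want.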

4 Comments

is it possible to return a list with values as strings?
Just str() what you need: [str(x.mean()) for x in np.array_split([float(x) for x in hourly_temp], 365)]
Thanks, it's just that I only need one decimal. Can I reformat to only one decimal for the values?
You can round and format to 1 decimal place with format(): ['{0:.1f}'.format(x.mean()) for x in np.array_split([float(x) for x in hourly_temp], 365)]

averages = [sum(map(float, myrange)) / len(myrange) for myrange in zip(*[iter(my_big_list)] * range_size)]

is a pretty neat way to do it. Note that it will drop any trailing elements when the list length is not evenly divisible by the range size.

If you need to keep an uneven chunk at the end (i.e. a chunk size of 10 with a big list of 17 leaves 7 left over):

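from itertools import izip_longest as zip2  # Python 2 name; zip_longest in Python 3

averages = []
for myrange in zip2(*[iter(my_big_list)] * range_size):
    # The last chunk is padded with None; drop the padding before averaging.
    chunk = [float(x) for x in myrange if x is not None]
    averages.append(sum(chunk) / len(chunk))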
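The difference between the two variants can be seen on a small made-up list. The sketch below is written for Python 3 (`zip_longest` instead of `izip_longest`); `my_big_list` and `range_size` reuse the names above with invented values.

```python
from itertools import zip_longest

my_big_list = ['1', '2', '3', '4', '5']  # invented: 5 items, not divisible by 2
range_size = 2

# zip() stops at the shortest iterator, so the trailing '5' is dropped.
truncated = [sum(map(float, chunk)) / len(chunk)
             for chunk in zip(*[iter(my_big_list)] * range_size)]

# zip_longest() pads the last chunk with None; strip the padding before averaging.
with_remainder = []
for chunk in zip_longest(*[iter(my_big_list)] * range_size):
    values = [float(x) for x in chunk if x is not None]
    with_remainder.append(sum(values) / len(values))

print(truncated)       # [1.5, 3.5]
print(with_remainder)  # [1.5, 3.5, 5.0]
```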

11 Comments

I'd rather not use range, a name of a built-in function, as a variable name, though here its usage does not lead to problems.
Since the list comprehension leaks the range variable into the surrounding scope, it'll cause problems if the surrounding code needs the range function.
very valid criticism (was not thinking)... changed :)
Although a bit strange, an alternative if you wanted to handle uneven chunks (or just not use the grouper recipe for a change :p), then you can use: [sum(el) / len(el) for el in iter(lambda it=iter(my_big_list): list(islice(it, range_size)), [])]
Note that an uneven block using izip_longest will pad with None and thus will cause the sum() to fail... You could change fillvalue= to be 0, but that'll distort the average. The best you could do to make it correct would be to filter None objects from groups

Something along these lines seems to work:

[ sum(hourly_temp[i:i+24]) / len(hourly_temp[i:i+24]) for i in xrange(0, len(hourly_temp), 24) ] 
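In Python 3, `xrange` is spelled `range`, and the string values from the question need converting before `sum` will accept them. A sketch under those assumptions, using invented data and a hypothetical `chunk_means` name:

```python
def chunk_means(values, size=24):
    """Hypothetical helper: mean of each consecutive slice of `size` items."""
    means = []
    for i in range(0, len(values), size):
        chunk = [float(v) for v in values[i:i + size]]  # last chunk may be shorter
        means.append(sum(chunk) / len(chunk))
    return means


# Invented data: 30 readings, so the second chunk has only 6 items.
hourly_temp = ['2.0'] * 24 + ['5.0'] * 6
print(chunk_means(hourly_temp))  # [2.0, 5.0]
```

Dividing by the actual chunk length, rather than a fixed 24, keeps the last average correct when the list does not end on a full day.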

3 Comments

len(hourly_temp[i:i+24]) is quite likely to be 24.
@njzk2 except when you hit the end of the list, if there wasn't a multiple of 24 items in the original list...
True. I was assuming "I have 8760 elements in the list" is always true, but covering the general case is indeed better.

Using this grouper recipe, it's pretty easy (obviously, I've synthesized the temps list):

#!/usr/bin/python
import itertools as it

temps = range(96)  # synthesized data: 4 days of hourly values

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return it.izip_longest(*args, fillvalue=fillvalue)  # zip_longest in Python 3

# float() avoids integer (floor) division under Python 2
daily_averages = [sum(x) / float(len(x)) for x in grouper(temps, 24)]
yearly_average = sum(daily_averages) / len(daily_averages)
print(daily_averages, yearly_average)
