Does anyone know how I can calculate the mean of one of these columns (on Linux)?

sda 2.91 20.44 6.13 2.95 217.53 186.67 44.55 0.84 92.97
sda 0.00 0.00 2.00 0.00 80.00 0.00 40.00 0.22 110.00
sda 0.00 0.00 2.00 0.00 144.00 0.00 72.00 0.71 100.00
sda 0.00 64.00 0.00 1.00 0.00 8.00 8.00 2.63 10.00
sda 0.00 1.84 0.31 1.38 22.09 104.29 74.91 3.39 2291.82
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

For example: mean(column 2)

5 Answers

Awk:

awk '{ total += $2 } END { print total/NR }' yourFile.whatever 

Read as:

  • For each line, add column 2 to a variable 'total'.
  • At the end of the file, print 'total' divided by the number of records.
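
For readers who want the same logic spelled out step by step, here is a minimal Python sketch of the running-total approach described above. The function name `column_mean` and the `sample` list (the rows from the question) are illustrative assumptions, not part of the answer. Unlike awk, this sketch skips lines that are too short rather than counting them, and it raises an error on non-numeric fields.

```python
def column_mean(lines, column):
    """Mean of a 1-based, whitespace-separated column (like awk's $column)."""
    total = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            # Skip blank or short lines; awk would still count them in NR.
            continue
        total += float(fields[column - 1])
        count += 1
    return total / count if count else float("nan")


# Sample rows copied from the question.
sample = [
    "sda 2.91 20.44 6.13 2.95 217.53 186.67 44.55 0.84 92.97",
    "sda 0.00 0.00 2.00 0.00 80.00 0.00 40.00 0.22 110.00",
    "sda 0.00 0.00 2.00 0.00 144.00 0.00 72.00 0.71 100.00",
    "sda 0.00 64.00 0.00 1.00 0.00 8.00 8.00 2.63 10.00",
    "sda 0.00 1.84 0.31 1.38 22.09 104.29 74.91 3.39 2291.82",
    "sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00",
]

print(column_mean(sample, 2))
```

For this sample, column 2 averages to about 0.485. Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.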

2 Comments

@Porges: How can I restrict this to a specific interval? Let's say I want the mean of elements 2 to 6 in the second column.
@SathishKrishnan this is a bit late, but for anyone else: you would prefix the first part with NR==2,NR==6 { total += ..... (see: gnu.org/software/gawk/manual/html_node/Ranges.html). Note that you would then divide by the number of rows in the range rather than by NR.
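
As a rough illustration of the row-range idea from the comments above, here is a Python sketch that averages a column over rows `first` to `last` (1-based, inclusive). The function name `column_mean_in_range` and its parameters are hypothetical. The point it tries to show is that the divisor has to be the number of rows actually summed, not the total line count.

```python
def column_mean_in_range(lines, column, first, last):
    """Mean of a 1-based column over rows first..last (1-based, inclusive)."""
    total = 0.0
    count = 0
    for row_number, line in enumerate(lines, start=1):
        if row_number < first:
            continue
        if row_number > last:
            break
        fields = line.split()
        if len(fields) < column:
            continue  # ignore blank or short lines inside the range
        total += float(fields[column - 1])
        count += 1  # count only the rows that were summed
    return total / count if count else float("nan")


# Sample rows copied from the question.
sample = [
    "sda 2.91 20.44 6.13 2.95 217.53 186.67 44.55 0.84 92.97",
    "sda 0.00 0.00 2.00 0.00 80.00 0.00 40.00 0.22 110.00",
    "sda 0.00 0.00 2.00 0.00 144.00 0.00 72.00 0.71 100.00",
    "sda 0.00 64.00 0.00 1.00 0.00 8.00 8.00 2.63 10.00",
    "sda 0.00 1.84 0.31 1.38 22.09 104.29 74.91 3.39 2291.82",
    "sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00",
]

# Column 3 is used here because column 2 is all zeros in rows 2 to 6.
print(column_mean_in_range(sample, 3, 2, 6))
```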
Perl solution:

perl -lane '$total += $F[1]; END{print $total/$.}' file 

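-a autosplits each line into the @F array, which is indexed starting at 0, so $F[1] is the second column.
$. is the current line number, so in the END block it holds the total number of lines read.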

If your fields are separated by commas instead of whitespace:

perl -F, -lane '$total += $F[1]; END{print $total/$.}' file 
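
For comparison, a comma-separated variant can be sketched in Python with the standard `csv` module. The function name `csv_column_mean` and the sample text are assumptions for illustration only; the sample is a made-up comma-separated excerpt built from a few values in the question.

```python
import csv
import io


def csv_column_mean(text_stream, column):
    """Mean of a 1-based column read from a comma-separated text stream."""
    total = 0.0
    count = 0
    for row in csv.reader(text_stream):
        if len(row) < column:
            continue  # skip empty or short rows
        total += float(row[column - 1])
        count += 1
    return total / count if count else float("nan")


# Hypothetical comma-separated data; a real file opened with
# open(path, newline="") could be passed in the same way.
sample_csv = io.StringIO("sda,2.91,20.44\nsda,0.00,0.00\nsda,0.00,64.00\n")
print(csv_column_mean(sample_csv, 3))
```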

To print mean values of all columns, assign totals to array @t:

perl -lane 'for $c (0..$#F){$t[$c] += $F[$c]}; END{for $c (0..$#t){print $t[$c]/$.}}' file

Output:

0
0.485
14.38
1.74
0.888333333333333
77.27
49.8266666666667
39.91
1.29833333333333
434.131666666667
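
The per-column totals idea above can also be sketched in Python. This is only an approximation of the Perl one-liner: the name `column_means` is hypothetical, and treating non-numeric fields such as "sda" as 0 is an assumption chosen to mirror the 0 that Perl prints for the label column.

```python
def column_means(lines):
    """Return the mean of every whitespace-separated column as a list."""
    totals = []
    rows = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        rows += 1
        for index, field in enumerate(fields):
            if index >= len(totals):
                totals.append(0.0)  # grow the totals list for new columns
            try:
                value = float(field)
            except ValueError:
                value = 0.0  # assumption: non-numeric labels count as 0
            totals[index] += value
    return [total / rows for total in totals] if rows else []


# Sample rows copied from the question.
sample = [
    "sda 2.91 20.44 6.13 2.95 217.53 186.67 44.55 0.84 92.97",
    "sda 0.00 0.00 2.00 0.00 80.00 0.00 40.00 0.22 110.00",
    "sda 0.00 0.00 2.00 0.00 144.00 0.00 72.00 0.71 100.00",
    "sda 0.00 64.00 0.00 1.00 0.00 8.00 8.00 2.63 10.00",
    "sda 0.00 1.84 0.31 1.38 22.09 104.29 74.91 3.39 2291.82",
    "sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00",
]

for mean in column_means(sample):
    print(mean)
```

The printed values should agree with the Perl output up to floating-point formatting.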

You can use Python for that; it is available on Linux.

If the data comes from a file, take a look at this question; just use float instead.

For instance:

# mean.py (Python 3)
def main():
    with open("mean.txt", 'r') as f:
        # Parse every line into a list of floats.
        data = [[float(x) for x in line.split()] for line in f]
    columnTwo = []
    for row in data:
        columnTwo.append(row[1])  # index 1 is the second column
    print(sum(columnTwo, 0.0) / len(columnTwo))

if __name__ == "__main__":
    main()

Prints 14.38

I only included the numeric data in the mean.txt file, not the row label "sda".

1 Comment

My first thought would probably have been Python as well, but building the list may be needlessly inefficient here, since you only really need the sum and the number of lines. (Also, for the fun of it: with open("mean.txt", 'r') as f: n,t = map(sum, zip(*((1, float(line.split()[1])) for line in f))); print(t/n))
David Zaslavsky's version, for the fun of it:

with open("mean.txt", 'r') as f:
    n, t = map(sum, zip(*((1, float(line.split()[1])) for line in f)))
print(t/n)

Simple-r will calculate the mean with the following line:

r -k2 mean file.txt 

for the second column. It can also do much more sophisticated statistical analysis, since it relies on the R environment for its computations.
