How do I calculate the mean of a column?

Question

Does anyone know how I can calculate the mean of one of these columns (on Linux)?

sda 2.91 20.44 6.13 2.95 217.53 186.67 44.55 0.84 92.97
sda 0.00 0.00 2.00 0.00 80.00 0.00 40.00 0.22 110.00
sda 0.00 0.00 2.00 0.00 144.00 0.00 72.00 0.71 100.00
sda 0.00 64.00 0.00 1.00 0.00 8.00 8.00 2.63 10.00
sda 0.00 1.84 0.31 1.38 22.09 104.29 74.91 3.39 2291.82
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

For example: mean(column 2)

unix.stackexchange.com/questions/13731/…

– Ciro Santilli OurBigBook.com, Commented Nov 21, 2015 at 11:14

porges · Accepted Answer · 2010-06-26 02:21:01Z

109

Awk:

awk '{ total += $2 } END { print total/NR }' yourFile.whatever

Read as:

For each line, add column 2 to a variable 'total'.
At the end of the file, print 'total' divided by the number of records.
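For comparison, the same running-total idea can be sketched in Python. This is only an illustration, not part of the original answer: the helper name `column_mean` and the sample rows are assumptions, and columns are counted from 1 as in awk. One difference to be aware of: this sketch skips blank or short lines, whereas the awk one-liner still counts them in NR.

```python
def column_mean(lines, column):
    """Return the mean of a 1-based, whitespace-separated column.

    Hypothetical helper: it keeps only a running total and a row count,
    mirroring `total += $2` and `total/NR` in the awk one-liner.
    """
    total = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            continue  # skip blank or short lines (awk would still count them)
        total += float(fields[column - 1])
        count += 1
    if count == 0:
        raise ValueError("no rows contain the requested column")
    return total / count


if __name__ == "__main__":
    # Shortened sample rows in the same layout as the question's data.
    sample = [
        "sda 2.91 20.44 6.13",
        "sda 0.00 0.00 2.00",
        "sda 0.00 64.00 0.00",
    ]
    print(column_mean(sample, 2))
```

To read from a file instead, pass the open file object, for example `column_mean(open("yourFile.whatever"), 2)`, since iterating over a file yields its lines.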


2 Comments

SKPS Over a year ago

@Porges: How can I restrict this to specific rows? Say, in the second column, I want to find the mean of elements 2 to 6.

porges Over a year ago

@SathishKrishnan this is a bit late, but for anyone else: you would prefix the first part with NR==2,NR==6 { total += ..... (see: gnu.org/software/gawk/manual/html_node/Ranges.html)
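One thing to watch with a row range: NR in the END block still counts every line read, so once only some rows are summed, the divisor should be a separate count of the rows that matched. A hedged Python sketch of that bookkeeping follows; the name `mean_of_rows` and its parameters are hypothetical, and rows and columns are both numbered from 1.

```python
def mean_of_rows(lines, column, first_row, last_row):
    """Mean of a 1-based column over rows first_row..last_row (inclusive).

    Hypothetical helper: it divides by the number of rows actually summed,
    not by the total number of lines read.
    """
    total = 0.0
    count = 0
    for row_number, line in enumerate(lines, start=1):
        if row_number < first_row:
            continue  # before the range: ignore
        if row_number > last_row:
            break  # past the range: nothing more to add
        total += float(line.split()[column - 1])
        count += 1
    if count == 0:
        raise ValueError("no rows fall inside the requested range")
    return total / count


if __name__ == "__main__":
    rows = ["x 1", "x 2", "x 3", "x 4"]
    print(mean_of_rows(rows, column=2, first_row=2, last_row=3))
```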

Chris Koknat · Accepted Answer · 2016-10-18 23:55:14Z

Perl solution:

perl -lane '$total += $F[1]; END{print $total/$.}' file

-a autosplits the line into the @F array, which is indexed starting at 0
$. is the current line number, so in the END block it holds the total number of lines read

If your fields are separated by commas instead of whitespace:

perl -F, -lane '$total += $F[1]; END{print $total/$.}' file

To print mean values of all columns, assign totals to array @t:

perl -lane 'for $c (0..$#F){$t[$c] += $F[$c]}; END{for $c (0..$#t){print $t[$c]/$.}}'

Output:

0
0.485
14.38
1.74
0.888333333333333
77.27
49.8266666666667
39.91
1.29833333333333
434.131666666667
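The per-column totals can be sketched in Python as well. This is an illustrative sketch rather than a translation of the Perl: the names `to_number` and `column_means` are hypothetical, and non-numeric fields such as the "sda" label are counted as 0, an assumption chosen to match the leading 0 in the Perl output above.

```python
def to_number(field):
    """Convert a field to float, treating non-numeric text as 0 (assumption)."""
    try:
        return float(field)
    except ValueError:
        return 0.0


def column_means(lines):
    """Return a list with the mean of every whitespace-separated column."""
    totals = []
    rows = 0
    for line in lines:
        fields = line.split()
        rows += 1  # every line counts, like Perl's $.
        # Grow the totals list if this row has more columns than seen so far.
        if len(fields) > len(totals):
            totals.extend([0.0] * (len(fields) - len(totals)))
        for index, field in enumerate(fields):
            totals[index] += to_number(field)
    if rows == 0:
        return []
    return [total / rows for total in totals]


if __name__ == "__main__":
    for mean in column_means(["sda 1 2", "sda 3 4"]):
        print(mean)
```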

Community · Accepted Answer · 2017-05-23 11:46:34Z

You can use Python for that; it is available on Linux.

If the data comes from a file, take a look at this question; just use float instead.

For instance:

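# mean.py
def main():
    with open("mean.txt") as f:
        # Parse every non-empty line into a list of floats
        data = [[float(x) for x in line.split()] for line in f if line.strip()]
    # Second column of the file (index 1)
    column_two = [row[1] for row in data]
    # Print with up to 12 significant digits
    print("%.12g" % (sum(column_two) / len(column_two)))

if __name__ == "__main__":
    main()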

Prints 14.38

(I included only the data in the mean.txt file, not the row header "sda".)

My first thought would probably have been Python as well, but building the list might be needlessly inefficient here, since you only really need the sum and the number of lines. (Also, for the fun of it: with open("mean.txt", 'r') as f: n,t = map(sum, zip(*((1, float(line.split()[1])) for line in f))); print(t/n))

2 revs · Accepted Answer · 2017-05-23 12:10:25Z

From David Zaslavsky, for the fun of it:

with open("mean.txt", 'r') as f:
    n, t = map(sum, zip(*((1, float(line.split()[1])) for line in f)))
print(t/n)

kenorb · Accepted Answer · 2016-06-28 09:21:57Z

Simple-r will calculate the mean with the following line:

r -k2 mean file.txt

for the second column. It can also do much more sophisticated statistical analysis, since it uses the R environment for all of its statistical analysis.
