How can I quickly sum all numbers in a file?

Question

Each line contains text and a number in a single column. I need to calculate the sum of the numbers across all lines. How can I do that? Thanks.

example.log contains:

time=31sec
time=192sec
time=18sec
time=543sec

The answer should be 784

I tried awk '{ sum += $1}; END { print sum }' example.log, but that only works when each line consists of just a number.
– Jack, Commented May 27, 2015 at 18:13
There is an almost identical question on Stack Overflow: "How can I quickly sum all numbers in a file?" Maybe it is time to have cross-site duplicates?
– fedorqui, Commented May 28, 2015 at 7:26

cuonglm · Accepted Answer · 2015-05-28 01:26:01Z

If your grep supports the -o option, you can try:

$ grep -o '[[:digit:]]*' file | paste -sd+ - | bc
784

POSIXly:

$ printf %d\\n "$(( $(tr -cs 0-9 '[\n*]' <file | paste -sd+ -) ))"
784
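
To see what each stage of the POSIX pipeline contributes, the stages can be run separately. This is only a sketch: the sample helper is a made-up stand-in for the question's example.log, and the expression is stored in a variable (expr_str, also a made-up name) instead of being evaluated inline.

```shell
# Hypothetical helper: prints the four sample lines from the question.
sample() { printf '%s\n' time=31sec time=192sec time=18sec time=543sec; }

# tr -c selects every byte that is NOT a digit and maps it to a newline;
# -s squeezes each run of newlines into a single one.
# paste -s then joins the resulting lines with "+".
# Because the input starts with a non-digit, the first line is empty and
# the joined expression starts with "+".
expr_str=$(sample | tr -cs 0-9 '[\n*]' | paste -sd+ -)
echo "$expr_str"

# Shell arithmetic evaluates the expression; the leading "+" is read as
# a unary plus.
echo "$(( $expr_str ))"
```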

Janis · Accepted Answer · 2015-05-27 18:19:27Z


With a newer version (4.x) of GNU awk:

awk 'BEGIN {FPAT="[0-9]+"}{s+=$1}END{print s}'

With other awks try:

awk -F '[a-z=]*' '{s+=$2}END{print s}'

answered May 27, 2015 at 18:14 by Janis; edited May 27, 2015 at 18:19

You need s+0 in case s is empty, so that it prints 0 instead of an empty line.
– cuonglm, Commented May 27, 2015 at 18:20
Let me explain that. There is just one case where s can be empty: if the input data contains no lines (i.e. if there is no input at all). In that case two behaviours are possible: 1) no input => no output, or 2) always output something, if only 0. Both are sensible options depending on the application context. The +0 addresses option 2). To address option 1) you would instead have to write END {if(s) print s}. Therefore it makes no sense to assume either option (for this corner case of no data) unless the question specifies it.
– Janis, Commented May 28, 2015 at 12:40
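
The three END actions discussed in these comments can be compared on empty input. A minimal sketch, assuming /dev/null as the empty input; end_output_bytes is a made-up helper that reports how many bytes awk printed, so that an empty line (1 byte) can be told apart from no output at all (0 bytes).

```shell
# Hypothetical helper: runs the summing program with the given END action
# on empty input and prints the number of bytes awk wrote.
end_output_bytes() {
    # The trailing "x" is a sentinel that protects newlines from being
    # stripped by the command substitution.
    out=$(awk "{ s += \$1 } END { $1 }" < /dev/null; echo x)
    out=${out%x}
    echo "${#out}"
}

echo "print s        : $(end_output_bytes 'print s') byte(s)"
echo "print s + 0    : $(end_output_bytes 'print s + 0') byte(s)"
echo "if (s) print s : $(end_output_bytes 'if (s) print s') byte(s)"
```

Note that the guarded form also stays silent when the input is non-empty but the sum happens to be 0, which may or may not be what is wanted.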


Stéphane Chazelas · Accepted Answer · 2015-05-28 21:43:07Z


awk -F= '{sum+=$2};END{print sum}'

answered May 28, 2015 at 4:38 by snth; edited May 28, 2015 at 21:43 by Stéphane Chazelas

We prefer long form answers. Can you please elaborate on how this works?
– slm ♦, Commented May 28, 2015 at 9:31
@slm, that answer is not any more or less verbose than the other answers here and is self-explanatory. It also has the advantage of working with input like time=1.4e5sec.
– Stéphane Chazelas, Commented May 28, 2015 at 21:42
@StéphaneChazelas - agreed, but this is a new user and we do encourage users to provide more than single-line answers. A bit of text explaining how it works would make it a much stronger answer than just code.
– slm ♦, Commented May 28, 2015 at 21:47
@slm, this is a new user with one of the best answers (from a technical standpoint) and he gets two downvotes and a negative comment. Not a very warm welcome.
– Stéphane Chazelas, Commented May 28, 2015 at 21:52
@TomFenech, the POSIX syntax for awk requires that those pattern/action items be separated by either ";" or "newline", so you may find awk implementations where it fails without this ";".
– Stéphane Chazelas, Commented May 29, 2015 at 9:28
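
The remark above that the -F= answer also copes with input like time=1.4e5sec relies on awk's string-to-number conversion, which uses the leading numeric part of a field and ignores the rest. A small sketch; sum_after_equals is a made-up wrapper around the one-liner from the answer, and the second input is invented for illustration.

```shell
# Hypothetical wrapper around the answer's one-liner.
# -F= splits "time=31sec" into $1="time" and $2="31sec"; in the numeric
# context of "+=", awk converts the leading numeric prefix of $2.
sum_after_equals() { awk -F= '{ sum += $2 }; END { print sum }'; }

# The sample data from the question.
printf '%s\n' time=31sec time=192sec time=18sec time=543sec | sum_after_equals

# A value written in exponent notation, as mentioned in the comment.
printf '%s\n' time=1.4e5sec time=31sec | sum_after_equals
```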


Stéphane Chazelas · Accepted Answer · 2015-09-28 20:33:52Z

Another GNU awk one:

awk -v RS='[0-9]+' '{n+=RT};END{print n}'

A perl one:

perl -lne'$n+=$_ for/\d+/g}{print$n'

A POSIX one:

tr -cs 0-9 '[\n*]' | grep . | paste -sd + - | bc
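
In the POSIX variant, grep . drops the empty first line that tr produces when the input starts with a non-digit. The sketch below shows the joined expression with and without that step. The sample helper is a made-up stand-in for the input file. My reading of the POSIX bc grammar is that it defines a unary minus but no unary plus, which would explain why the leading + is removed before bc sees it; treat that as an assumption rather than something the answer states.

```shell
# Hypothetical helper: prints the four sample lines from the question.
sample() { printf '%s\n' time=31sec time=192sec time=18sec time=543sec; }

# Without grep: the empty first line turns into a leading "+".
sample | tr -cs 0-9 '[\n*]' | paste -sd+ -

# With grep . (keep only lines that contain at least one character):
sample | tr -cs 0-9 '[\n*]' | grep . | paste -sd+ -
```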

don_crissti · Accepted Answer · 2015-05-27 21:31:36Z


sed 's/=/ /' file | awk '{ sum+=$2 } END { print sum}'

answered May 27, 2015 at 21:07 by user2570505; edited May 27, 2015 at 21:31 by don_crissti

Awesome answer, but no need for sed: awk --field-separator = '{ sum+=$2 } END { print sum}' data.dat
– user1717828, Commented May 27, 2015 at 23:45
@user1717828: you should rather use the (shorter, and more compatible!) -F'=' instead of --field-separator =
– Olivier Dulac, Commented May 29, 2015 at 9:14
@OlivierDulac, weird, my man awk only gives -F fs and --field-separator fs
– user1717828, Commented May 29, 2015 at 10:40
@user1717828: -F'=' or -F '=' are two ways of writing -F fs (fs is "=" in your case). I added the single quotes to ensure the fs is properly seen and interpreted by awk, not the shell (useful if the fs is ';', for example).
– Olivier Dulac, Commented May 29, 2015 at 11:56
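
The quoting point can be tried with a separator that the shell would otherwise interpret. A sketch with invented semicolon-separated data (the sample helper and its contents are made up for illustration):

```shell
# Hypothetical semicolon-separated variant of the sample data.
sample() { printf '%s\n' 'time;31sec' 'time;192sec'; }

# Attached and detached forms hand the same separator to awk.
sample | awk -F';'  '{ sum += $2 } END { print sum }'
sample | awk -F ';' '{ sum += $2 } END { print sum }'

# Unquoted, the shell would end the command at ";" and awk would never
# see the separator or the program:
#   awk -F; '{ sum += $2 } END { print sum }'
```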


cuonglm · Accepted Answer · 2015-05-27 18:18:31Z


You can try this:

awk -F"[^0-9]+" '{ sum += $2 } END { print sum+0; }' file
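
Why $2 and not $1: with "one or more non-digits" as the separator, a line such as time=31sec begins with a separator, so the first field is empty and the number lands in the second. A sketch that prints the split for one line (show_fields is a made-up helper name):

```shell
# Hypothetical helper: shows how awk splits each input line when the
# field separator is the regex [^0-9]+ (a run of non-digits).
show_fields() {
    awk -F'[^0-9]+' '{ printf "NF=%d 1=[%s] 2=[%s] 3=[%s]\n", NF, $1, $2, $3 }'
}

echo 'time=31sec' | show_fields
```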

answered May 27, 2015 at 18:17 by taliezin; edited May 27, 2015 at 18:18 by cuonglm


Stephen Quan · Accepted Answer · 2015-05-27 23:45:14Z

Everyone has posted awesome awk answers, which I like very much.

A variation to @cuonglm replacing grep with sed:

sed 's/[^0-9]//g' example.log | paste -sd'+' - | bc

The sed strips everything except the digits.
The paste -sd+ - command joins all the lines into a single line, separated by +.
The bc evaluates the resulting expression.
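
One caveat with the sed variant: because it deletes every non-digit, two numbers on the same line are glued together into one. The sketch below shows this. It evaluates the expression with shell arithmetic in place of bc (that swap is mine, to avoid the extra dependency), and both sum_digits and the second input are made up for illustration.

```shell
# Hypothetical helper: the sed | paste stages from the answer,
# evaluated with shell arithmetic instead of bc.
sum_digits() {
    expr_str=$(sed 's/[^0-9]//g' | paste -sd'+' -)
    echo "$(( $expr_str ))"
}

# One number per line, as in the question.
printf '%s\n' time=31sec time=192sec time=18sec time=543sec | sum_digits

# Two numbers on one line: "a=31 b=18" becomes 3118 before summing.
printf '%s\n' 'a=31 b=18' time=192sec | sum_digits
```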

mikeserv · Accepted Answer · 2015-05-28 04:53:46Z


You should use a calculator.

{ tr = \ | xargs printf '[%s=]P%d+p' | dc; } <infile 2>/dev/null

With your four lines that prints:

time=31
time=223
time=241
time=784

And more simply:

tr times=c ' + p' <infile |dc

...which prints...

31
223
241
784

If speed is what you're after then dc is what you want. Traditionally bc was a front end that compiled its input to dc code, and on many systems it still is.

answered May 28, 2015 at 4:38 by mikeserv; edited May 28, 2015 at 4:53

Not according to my measurements: it depends on how much work you have to do to generate the formula.
– glenn jackman, Commented May 28, 2015 at 13:24
@glennjackman - your measurements don't include dc as near as I can tell. What are you talking about?
– mikeserv, Commented May 28, 2015 at 15:14
By the way, when comparing the old crew to the new crew - such as when you benchmark perl v the standard unix toolset - it really doesn't make much sense if you use GNU tools compiled on a GNU toolchain. All of the bloat that can negatively affect Perl's performance is also in all of those GNU-compiled GNU utils. Sad but true. You need a real, simply built, simple toolset to accurately judge the difference. Like an heirloom-toolchest set statically linked against musl libs for instance - in that way you can bench the one-tool/one-job paradigm vs the one-tool-to-rule-them-all one.
– mikeserv, Commented May 28, 2015 at 15:27


Avinash Raj · Accepted Answer · 2015-05-29 04:10:08Z


Through python3,

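import re

with open('example.log') as f:   # file name taken from the question
    m = f.read()
l = re.findall(r'\d+', m)        # every run of digits, as strings
print(sum(map(int, l)))          # convert to int before summing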

answered May 28, 2015 at 4:14 by Avinash Raj; edited May 29, 2015 at 4:10

re.findall returns a list of strings, this is not going to work
– iruvar, Commented May 28, 2015 at 22:28
@1_CR yes, I forgot that. Check it now.
– Avinash Raj, Commented May 29, 2015 at 4:10
Maybe sum(int(e) for e in l) is more pythonic.
– cuonglm, Commented May 29, 2015 at 15:18


cuonglm · Accepted Answer · 2015-06-06 07:26:59Z

Pure bash solution (Bash 3+):

while IFS= read -r line; do               # While it reads a line:
    if [[ "$line" =~ [0-9]+ ]]; then      # If the line contains numbers:
        ((counter+=BASH_REMATCH[0]))      # Add the current number to counter
    fi                                    # End if.
done                                      # End loop.
echo "Total number: $counter"             # Print the number.
unset counter                             # Reset counter to 0.

Short version:

while IFS= read -r l; do [[ "$l" =~ [0-9]+ ]] && ((c+=BASH_REMATCH)); done; echo $c; c=0

Maybe also: PS4='$((x+=${time%s*}))' time=0 x=0 sh -x <infile — mikeserv
– mikeserv, Commented May 31, 2015 at 9:39
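
For shells without [[ =~ ]], roughly the same loop can be written with POSIX parameter expansion instead of a regex match. This is a sketch under the assumption that every line looks like key=NUMBERunit, as in the question. sum_times is a made-up name, and a number with leading zeros would be read as octal by the shell arithmetic.

```shell
# Hypothetical POSIX sh counterpart of the bash loop: reads lines from
# standard input and prints the sum of the number after the first "=".
sum_times() {
    total=0
    while IFS= read -r line; do
        n=${line#*=}        # drop everything up to and including the first "="
        n=${n%%[!0-9]*}     # drop the first non-digit and everything after it
        if [ -n "$n" ]; then
            total=$((total + n))
        fi
    done
    echo "$total"
}

printf '%s\n' time=31sec time=192sec time=18sec time=543sec | sum_times
```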
