5

I have a log file. I want to find every line that contains a specific number and sum the last number on each of those lines. Using grep and cut is no problem, but I don't know how to sum the numbers. I tried some solutions from StackExchange but couldn't get them to work in my case.

This is what I have so far:

grep "30201" logfile.txt | cut -f6 -d "|" 

30201 is the number that identifies the lines I'm looking for.

I want to sum the last numbers on those lines: 650, 1389 and 945.

The logfile.txt:

    Jan 09 2016|09:15:17|30201|1|SL02|650
    Jan 09 2016|09:15:18|43097|1|SL01|945
    Jan 09 2016|09:15:19|28774|2|SB03|1389
    Jan 09 2016|09:16:21|00788|1|SL02|650
    Jan 09 2016|09:17:25|03361|3|SL01|945
    Jan 09 2016|09:17:33|08385|1|SL02|650
    Jan 09 2016|09:18:43|10234|1|SL01|945
    Jan 09 2016|09:21:55|00788|1|SL02|650
    Jan 09 2016|09:24:43|03361|3|SB03|1389
    Jan 09 2016|09:26:01|30201|1|SB03|1389
    Jan 09 2016|09:26:21|28774|2|SL02|650
    Jan 09 2016|09:26:25|00788|1|SL02|650
    Jan 09 2016|09:27:21|28774|2|SL02|650
    Jan 09 2016|09:29:32|30201|1|SL01|945
    Jan 09 2016|09:30:12|34032|1|SB03|1389
    Jan 09 2016|09:30:15|08767|3|SL02|650

4 Answers

14

You can use paste to join the numbers with + signs, a format that bc can then add up:

    % grep "30201" logfile.txt | cut -f6 -d "|"
    650
    1389
    945

    % grep "30201" logfile.txt | cut -f6 -d "|" | paste -sd+
    650+1389+945

    % grep "30201" logfile.txt | cut -f6 -d "|" | paste -sd+ | bc
    2984

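If your grep supports PCRE (-P), you can drop cut and extract the numbers with grep alone. \K discards everything matched so far, much like a positive lookbehind, so only the trailing number is printed:

    % grep -Po '\|30201\|.*\|\K\d+' logfile.txt | paste -sd+ | bc
    2984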

With awk alone:

    % awk -F'|' '$3 == 30201 {sum+=$NF}; END{print sum}' logfile.txt
    2984
  • -F'|' sets the field separator as |
  • $3 == 30201 {sum+=$NF} adds up the last field's values if the third field is 30201
  • END{print sum} prints the sum at the END
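
As a minimal sketch, the same awk idea can be wrapped in a small shell function. The name sum_last_field and the temporary sample file are assumptions made for illustration, not part of the answer. The ID is passed in with -v and compared as a string, so an ID with leading zeros such as 00788 is matched literally rather than numerically.

```sh
#!/bin/sh
# Hypothetical helper: sum the last field of every line whose third
# |-separated field equals the given ID exactly.
sum_last_field() {
    # ($3 "") turns the field into a string, which forces a string
    # comparison; "sum + 0" prints 0 instead of an empty line when
    # nothing matched.
    awk -F'|' -v id="$1" '($3 "") == id { sum += $NF } END { print sum + 0 }' "$2"
}

# Sample data: a few lines taken from the question's logfile.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 09 2016|09:15:17|30201|1|SL02|650
Jan 09 2016|09:15:18|43097|1|SL01|945
Jan 09 2016|09:16:21|00788|1|SL02|650
Jan 09 2016|09:26:01|30201|1|SB03|1389
Jan 09 2016|09:29:32|30201|1|SL01|945
EOF

sum_last_field 30201 "$sample"   # prints 2984
sum_last_field 788 "$sample"     # no line has the ID 788, prints 0
rm -f "$sample"
```

Passing the ID as a variable avoids editing the awk program for each search.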
2
  • Thanks! grep "30201" logfile.txt | cut -f6 -d "|" | paste -sd+ didn't work for me; I got the error (standard_in) 1: illegal character: ^M. But the solution with grep -Po '\|30201\|.*\|\K\d+' works. This is great! Commented Apr 13, 2019 at 11:16
  • 2
    Note that the grep solution does not care in which column the number is found, or whether the number is just a substring of a longer number. The awk solution is safer in this respect. The grep solution could be improved by first cutting and then matching the number at the start of the line (followed by |) with proper anchoring. Commented Apr 13, 2019 at 12:50
1

There is nothing really wrong with your grep and cut command. You could make it more robust by using "|30201|" as the search pattern. The issue then is dealing with the output.

Using bash:

    #!/bin/bash
    # Read the matching values into a bash array, then add up the elements.
    nums=( $(grep "|30201|" logfile.txt | cut -f6 -d "|") )
    total=0
    for n in "${nums[@]}"
    do
        total=$((total + n))
    done
    echo "$total"
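
For comparison, here is a sketch that does the selection and the summing in the shell itself, without grep or cut, by letting read split each line on |. The function name sum_for_id and the variable names are hypothetical. It assumes the values have no leading zeros, since shell arithmetic would read those as octal.

```sh
#!/bin/sh
# Hypothetical helper: sum field 6 of the lines whose field 3 equals $1.
sum_for_id() {
    total=0
    # IFS='|' makes read split on the pipe character; the unused fields
    # still need placeholder names so that "value" receives field 6.
    while IFS='|' read -r _date _time code _qty _station value; do
        if [ "$code" = "$1" ]; then
            total=$((total + value))
        fi
    done < "$2"
    echo "$total"
}

# Sample data: a few lines taken from the question's logfile.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 09 2016|09:15:17|30201|1|SL02|650
Jan 09 2016|09:15:19|28774|2|SB03|1389
Jan 09 2016|09:26:01|30201|1|SB03|1389
Jan 09 2016|09:29:32|30201|1|SL01|945
EOF

sum_for_id 30201 "$sample"   # prints 2984
rm -f "$sample"
```

A shell loop like this is slower than awk on large files, so it is mainly useful when the rest of the script is already shell.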
0

Bash solution.

    #!/bin/bash
    pa=0 ; s=0
    while read a b ; do
        if [ "$a" == "$pa" ] ; then
            s=$(($s+$b))
        else
            if [ "$pa" != 0 ] ; then
                echo $pa $s
            fi
            pa=$a ; s=$b
        fi
    done < <(cat j.txt | awk -F'|' '{printf("%s %s\n",$3,$6)}' | sort -n)
    echo $pa $s

Initialize Previous A (pa) and SUM (s).

Cut the input down to fields 3 and 6 and sort them numerically.

In the loop, as long as field 3 stays the same, add field 6 to SUM.

When field 3 changes, output Previous A and SUM (unless Previous A is still 0), then reset Previous A to the new field 3 and SUM to the field 6 just read.

After the loop, output the last Previous A and SUM.

Output of the given input:

    00788 1950
    03361 2334
    08385 650
    08767 650
    10234 945
    28774 2689
    30201 2984
    34032 1389
    43097 945
1
  • Given that you are already using awk to select the third and sixth columns, you could go the extra step and do the summing inside awk. This would give you something like awk -F'|' '{s[$3]+=$6}END{ for (i in s) { print i, s[i] }}' | sort. GNU awk also has a builtin asort which could be used rather than an external sort. Commented Apr 13, 2019 at 18:48
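
The comment's one-liner omits the input file. A complete sketch, run against the question's data written to a temporary file (the variable names sample and totals are assumptions for illustration), could look like this:

```sh
#!/bin/sh
# The question's logfile.txt, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 09 2016|09:15:17|30201|1|SL02|650
Jan 09 2016|09:15:18|43097|1|SL01|945
Jan 09 2016|09:15:19|28774|2|SB03|1389
Jan 09 2016|09:16:21|00788|1|SL02|650
Jan 09 2016|09:17:25|03361|3|SL01|945
Jan 09 2016|09:17:33|08385|1|SL02|650
Jan 09 2016|09:18:43|10234|1|SL01|945
Jan 09 2016|09:21:55|00788|1|SL02|650
Jan 09 2016|09:24:43|03361|3|SB03|1389
Jan 09 2016|09:26:01|30201|1|SB03|1389
Jan 09 2016|09:26:21|28774|2|SL02|650
Jan 09 2016|09:26:25|00788|1|SL02|650
Jan 09 2016|09:27:21|28774|2|SL02|650
Jan 09 2016|09:29:32|30201|1|SL01|945
Jan 09 2016|09:30:12|34032|1|SB03|1389
Jan 09 2016|09:30:15|08767|3|SL02|650
EOF

# s is an associative array keyed by field 3; each line adds its field 6
# to the running total for that key. "for (i in s)" visits the keys in an
# unspecified order, hence the external sort.
totals=$(awk -F'|' '{ s[$3] += $6 } END { for (i in s) print i, s[i] }' "$sample" | sort -n)
echo "$totals"
rm -f "$sample"
```

This prints the same nine lines as the bash loop above, from 00788 1950 through 43097 945, in a single pass over the file.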
0

One little tool I keep around I call sumcol:

    #!/bin/sh
    # Icarus Sparry. Free for any use.
    C=${1:?"missing required column number"}
    shift
    awk '{s+=$'"$C"'} END { print s }' "$@"

which adds up the whitespace delimited column you provide. Whilst I would write (as @heemayl does)

awk -F'|' '$3 == 30201 {s+=$6} END{ print s}' logfile.txt 

for the OP's problem, he could use

grep "30201" logfile.txt | cut -f6 -d "|" | sumcol 1 

or

grep "30201" logfile.txt | tr "| " " _" | sumcol 6 
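
The tr step exists only because sumcol splits on whitespace. As a sketch, a variant that takes the field separator as its first argument avoids that step. The name sumcol_sep and its argument order are hypothetical, not part of the original tool.

```sh
#!/bin/sh
# Hypothetical variant of sumcol: sumcol_sep SEPARATOR COLUMN [FILE...]
sumcol_sep() {
    sep=${1:?"missing required field separator"}
    col=${2:?"missing required column number"}
    shift 2
    # $c selects the field whose number is stored in the awk variable c.
    awk -F"$sep" -v c="$col" '{ s += $c } END { print s + 0 }' "$@"
}

# Usage with three of the question's log lines on standard input.
printf '%s\n' \
    'Jan 09 2016|09:15:17|30201|1|SL02|650' \
    'Jan 09 2016|09:26:01|30201|1|SB03|1389' \
    'Jan 09 2016|09:29:32|30201|1|SL01|945' |
    grep '|30201|' | sumcol_sep '|' 6   # prints 2984
```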
