4

I had a csv file of data such like that when read into shell:

name,income,reward,payment Jackson,10000,2000,1000 Paul,2500,700,200 Louis,5000,100,1800 

and I want to find the net earning for each person, use formula: "net = income+reward-payment".

when I used command to do this, it only calculate the first row of data.

$ cat data.csv | awk -F ',' '{for (i=1;i<=NF;i++) net[i] = $2+$3-$4} END {for (p in total) print p, "net = ", net[p]}' > result.txt 

How can I do the calculation here?

By the way, the names are not unique, so I try (for loop) to create index for the array [net].

My expected output is:

1 Jackson net = 11000 2 Paul net = 3000 3 Louis net = 3300 
0

3 Answers 3

5
$ awk -F, -v OFS=, 'NR>1 { print $1, $2+$3-$4 }' data.csv Jackson,11000 Paul,3000 Louis,3300 

Or if you want the net appended to the existing data, along with the (updated) header line:

$ awk -F, -v OFS=, 'NR==1 {print $0,"net"}; NR>1 {print $0, $2+$3-$4}' data.csv name,income,reward,payment,net Jackson,10000,2000,1000,11000 Paul,2500,700,200,3000 Louis,5000,100,1800,3300 
2
  • Assuming the CSV is really simple (no escaped commas, etc.)… Commented May 3, 2021 at 3:21
  • 1
    @D.BenKnoble yep. To parse CSV properly (with quotes and escapes etc), you need a real CSV parser. Either a tool like csvkit, or a CSV-parsing library for perl (e.g. Text::CSV) or python (e.g. csv) or whatever language you're using. This answer is only good for simple comma-delimited input like in the OP's example, not full CSV. Commented May 3, 2021 at 7:32
4

do in this way:

$ awk 'BEGIN{ FS=OFS="," } { $5=NR>1?$2+$3-$4:"net" }1' infile name,income,reward,payment,net Jackson,10000,2000,1000,11000 Paul,2500,700,200,3000 Louis,5000,100,1800,3300 

or having your updated expected output:

$ awk -F, 'NR>1{ print NR-1, $1, "net= " $2+$3-$4 }' ifnile 1 Jackson net= 11000 2 Paul net= 3000 3 Louis net= 3300 
3
  • What is the 1 doing in the }1 part? Commented May 2, 2021 at 10:15
  • 1
    @nit2qubit read unix.stackexchange.com/q/63891/72456 Commented May 2, 2021 at 10:35
  • " 1 is the condition, which always evaluates to true since it is nonzero. The action is omitted, so the default {print} is performed." Commented May 2, 2021 at 12:50
1

Using awk:

awk -F',' '{ p[NR] = $1;net[NR] = $2+$3-$4} END {for (i=2;i<=NR;i++) print (i-1), p[i], "net = ", net[i]}' data.csv

Because this is a csv file, FS = ',' or -F ','

Next p and net arrays created. These are indexed on NR(record numbers).

Then a for loop (i-1), p[i], "net = ", net[i] works as you planned. for loop is being started from 2 because first line is header line.

pipe in this case is useless use. Anohter thing your loop is good but limit should be NR not NF.

1
  • 1
    @EdMorton Rolled back to revision one. Thanks for pointing out. Commented May 3, 2021 at 12:11

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.