Since awk arrays are indexed by strings, you can use one array to keep the running total price for each brand, and another array to keep the count of records seen for that brand.
Because "brand" is field 4, you can index the arrays in awk like this:

    total_price[$4] += $3 # accumulate total price for this brand
    count[$4] += 1 # increment count of records for this brand

At the end, loop through the keys to the arrays, and format the output while calculating the averages.
Since POSIX awk contains no sort function, pipe the output of the awk command to the standard Unix sort command.
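As a minimal sketch of these steps, the following runs the accumulation on three made-up records (the names and prices are illustrative assumptions, not data from the question). It prints each brand with its total, record count, and average, and pipes the result through `sort` so the order is predictable:

```shell
#!/bin/sh
# Three hypothetical records (no header) in the column order
# first_name,last_name,price_paid,brand,year
summary=$(printf '%s\n' \
    'Ann,Lee,20000,Ford,2015' \
    'Bob,Ray,30000,Ford,2018' \
    'Cy,Poe,15000,Kia,2012' |
awk -F, '
{
    total_price[$4] += $3   # string key: the brand name
    count[$4] += 1
}
END {
    # brand, total, record count, average
    for (brand in total_price)
        print brand, total_price[brand], count[brand], total_price[brand] / count[brand]
}
' | sort)

printf '%s\n' "$summary"
```

With these sample records it prints `Ford 50000 2 25000` followed by `Kia 15000 1 15000`. Without `sort`, the order in which `for (brand in total_price)` visits the keys is unspecified.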
Please try this:

    #!/bin/sh
    #first_name,last_name,price_paid,brand,year
    #print for each brand, the average price paid
    awk -F, '
    NR == 1 {
      next # skip header
    }
    {
      price_paid[$4] += $3 # accumulate total price for this brand
      count[$4] += 1 # increment count of records for this brand
    }
    END {
      for (brand in price_paid) {
        printf "%s,%7.2f\n", brand, price_paid[brand] / count[brand]
      }
    }
    ' < "${1:?filename required}" | sort

Invoke the `awk` command, setting the field separator to comma (`,`) and passing everything between the single quote on this line and the next single quote several lines below as the script:

    awk -F, '

Skip Header: If the current record number is 1, then skip all processing on the current line (the first line), and get the next line of input:

    NR == 1 {
      next # skip header
    }

Accumulate Price Total Per Brand (this is executed on every line after the header):
The arrays `price_paid` and `count` are indexed by the `brand` string.
Add the current price paid (`$3`) to the `price_paid` total for this brand.
Increment the count of records for this brand:

    {
      price_paid[$4] += $3 # accumulate total price for this brand
      count[$4] += 1 # increment count of records for this brand
    }

Print the Output Table: After all input is processed, step through the keys (`brand`) of the `price_paid` array, and for each `brand`, print the `brand` and the average of `price_paid` for that `brand`:

    END {
      for (brand in price_paid) {
        printf "%s,%7.2f\n", brand, price_paid[brand] / count[brand]
      }
    }

Terminate the script argument, redirect input from the filename parameter, and pipe the output of the `awk` command to the `sort` command:

    ' < "${1:?filename required}" | sort

The single quote (`'`) terminates the script argument to `awk`.
`< "${1:?filename required}"` redirects the standard input of `awk` from the file named by the first command-line parameter to the script. If there is no parameter, the shell prints an error message containing "filename required" and does not run `awk`. Because this expansion sits inside a pipeline, the failing part usually runs in a subshell, so the script's own exit status is typically that of `sort` rather than an error status.
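To try the whole thing out, the sketch below saves a condensed copy of the script under a hypothetical name (`avg_by_brand.sh`), runs it on a made-up CSV file, and then runs it without an argument to capture the message produced by `${1:?filename required}`. The file names and data are assumptions for illustration only, and `mktemp -d` is assumed to be available:

```shell
#!/bin/sh
# Scratch directory for the hypothetical script and sample data
workdir=$(mktemp -d) || exit 1

# Condensed copy of the script, saved under a made-up name
cat > "$workdir/avg_by_brand.sh" <<'EOF'
#!/bin/sh
awk -F, '
NR == 1 { next }
{ price_paid[$4] += $3; count[$4] += 1 }
END {
    for (brand in price_paid)
        printf "%s,%7.2f\n", brand, price_paid[brand] / count[brand]
}
' < "${1:?filename required}" | sort
EOF

# Made-up input: a header line plus three records
cat > "$workdir/cars.csv" <<'EOF'
first_name,last_name,price_paid,brand,year
Ann,Lee,20000,Ford,2015
Bob,Ray,30000,Ford,2018
Cy,Poe,15000,Kia,2012
EOF

# Normal run: one line per brand, sorted by brand
averages=$(sh "$workdir/avg_by_brand.sh" "$workdir/cars.csv")
printf '%s\n' "$averages"

# Run without an argument and keep whatever the shell writes to stderr
sh "$workdir/avg_by_brand.sh" 2>"$workdir/err.txt"
status=$?
message=$(cat "$workdir/err.txt")
printf 'exit status: %s\nstderr: %s\n' "$status" "$message"

rm -rf "$workdir"
```

The first run prints `Ford,25000.00` and `Kia,15000.00`. The second run should leave a message containing "filename required" in `$message`; the exact wording around it and the printed exit status depend on the shell. Note that `%7.2f` pads to a width of 7, so an average below 1000 would appear with leading spaces after the comma.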