Just playing with awk
bu5hman

A little awk, a little regex, and tidy up the whitespace by piping to column

awk -F',' '{if ( $4 ~ /[Rr]ead/ && $5 > 20 || NR==1) print $5, $9}' data.csv | column -t 

Explanation.... after setting the field delimiter to a comma with -F','

.... if the 4th field has a regex match (~) with 'Read' or 'read' and (&&) the 5th field is > 20, or (||) we are on the first row (the one with the titles, NR==1), then print the columns you are interested in....
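To see it in action, here is a hypothetical data.csv (the column names and values are invented for illustration; the real file's layout isn't shown in the question):

```shell
# Hypothetical sample file -- fields 4, 5 and 9 are the ones the command uses
cat > data.csv <<'EOF'
ID,name,date,mess,score,a,b,c,tag
1,foo,2020,Read,25,x,y,z,alpha
2,bar,2020,write,30,x,y,z,beta
3,baz,2020,read,15,x,y,z,gamma
EOF

# Keep rows where field 4 matches Read/read AND field 5 > 20, plus the header row
awk -F',' '{if ( $4 ~ /[Rr]ead/ && $5 > 20 || NR==1) print $5, $9}' data.csv | column -t
```

With that input you get the header line (score tag) and only the first data row (25 alpha): row 2 fails the regex, row 3 fails the > 20 test.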

Just for fun

If you know the column headers but are too lazy to count....

Load the headers into an associative array

declare -A HEADS=( [mess]=mess [id]=ID [score]=score ) 

.....awk out the column indices from the first row of your data file into the array

for j in "${!HEADS[@]}"; do HEADS[$j]=$(awk -F',' -v s="${HEADS[$j]}" 'NR==1 {for (i=1; i<=NF; ++i) { if ($i ~ s) print i }}' data.csv) ; done 
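With a hypothetical header row of ID,name,date,mess,score,a,b,c,tag (again, made up for illustration), the loop resolves each name to its column number:

```shell
# Hypothetical header row only -- enough for the index lookup
printf 'ID,name,date,mess,score,a,b,c,tag\n' > data.csv

declare -A HEADS=( [mess]=mess [id]=ID [score]=score )
for j in "${!HEADS[@]}"; do
  # awk scans the header row and prints the index of the matching column
  HEADS[$j]=$(awk -F',' -v s="${HEADS[$j]}" 'NR==1 {for (i=1; i<=NF; ++i) { if ($i ~ s) print i }}' data.csv)
done

echo "mess=${HEADS[mess]} score=${HEADS[score]} id=${HEADS[id]}"   # mess=4 score=5 id=1
```

One caveat: ~ is a regex match, not an equality test, so a header that is a substring of another (say score next to score_max) would print two indices; anchoring the pattern ('$i ~ "^"s"$"') avoids that.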

... then back to the command at the top, just injecting the indices into awk as variables

awk -v mess="${HEADS[mess]}" -v score="${HEADS[score]}" -v id="${HEADS[id]}" -F',' '{if ( $mess ~ /[Rr]ead/ && $score > 20 || NR==1) print $score, $id}' data.csv | column -t 
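Putting the two halves together against the same hypothetical file (column names and values invented for illustration):

```shell
cat > data.csv <<'EOF'
ID,name,date,mess,score,a,b,c,tag
1,foo,2020,Read,25,x,y,z,alpha
2,bar,2020,write,30,x,y,z,beta
EOF

# Resolve header names to column indices
declare -A HEADS=( [mess]=mess [id]=ID [score]=score )
for j in "${!HEADS[@]}"; do
  HEADS[$j]=$(awk -F',' -v s="${HEADS[$j]}" 'NR==1 {for (i=1; i<=NF; ++i) { if ($i ~ s) print i }}' data.csv)
done

# Same filter as before, but by name instead of by counted position
awk -v mess="${HEADS[mess]}" -v score="${HEADS[score]}" -v id="${HEADS[id]}" -F',' \
    '{if ( $mess ~ /[Rr]ead/ && $score > 20 || NR==1) print $score, $id}' data.csv | column -t
```

This prints the header pair (score ID) and the one matching row (25 1); awk happily treats the shell-supplied index in $mess, $score, $id as a field number.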
