edited Apr 19, 2021 at 16:22

Assuming that some tags (names like "ALBERT") could be missing from the first line, just as they can be missing from other lines, you need a 2-pass approach: the first pass identifies all of the tags, and the second prints the values for all of them on every line, whether or not a given tag is present on that line.

answered Apr 19, 2021 at 16:13

Ed Morton

    $ cat tst.awk
    BEGIN { OFS=";" }
    NR==FNR {
        for (i=1; i<NF; i+=3 ) {
            if ( !seen[$i]++ ) {
                tags[++numTags] = $i
            }
        }
        next
    }
    {
        delete tag2val
        for (i=1; i<NF; i+=3) {
            tag = $i
            val = $(i+1) FS $(i+2)
            tag2val[tag] = val
        }
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "%s%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }

    $ awk -f tst.awk example.txt example.txt | column -t -s';' -o'; '
    some a; some b; some c; some d; some e
    some a; some b;       ;       ; some e
    some a; some b;       ; some d;

The above will output for each line the values for all of the tags (names, e.g. "ALBERT") in the order they appeared across all of the input.
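To illustrate what "the order they appeared across all of the input" means, here is a minimal sketch that runs only the first pass and prints the tag order it discovers. The file name `sample.txt` and its contents are hypothetical, chosen so that one tag ("CLAUDIA") is missing from the first line; the layout (a tag followed by a two-word value) is an assumption inferred from the script above.

```shell
# Hypothetical input: each tag is followed by a two-word value,
# and CLAUDIA does not appear until the second line.
cat > sample.txt <<'EOF'
ALBERT some a BRYAN some b
ALBERT some a CLAUDIA some c BRYAN some b
EOF

# First pass only: record each tag once, in order of first appearance,
# then print the collected tags separated by ";".
tag_order=$(awk '
    {
        for (i=1; i<NF; i+=3) {
            if ( !seen[$i]++ ) {
                tags[++numTags] = $i
            }
        }
    }
    END {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "%s%s", tags[tagNr], (tagNr<numTags ? ";" : "\n")
        }
    }
' sample.txt)

echo "$tag_order"   # prints ALBERT;BRYAN;CLAUDIA
```

CLAUDIA ends up last because it is first seen on line 2, after ALBERT and BRYAN were already recorded from line 1, even though it sits before BRYAN on that second line.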

If you want to see the tags as column headers:

    $ cat tst.awk
    BEGIN { OFS=";" }
    NR==FNR {
        for (i=1; i<NF; i+=3 ) {
            if ( !seen[$i]++ ) {
                tags[++numTags] = $i
            }
        }
        next
    }
    FNR==1 {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            printf "%s%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }
    {
        delete tag2val
        for (i=1; i<NF; i+=3) {
            tag = $i
            val = $(i+1) FS $(i+2)
            tag2val[tag] = val
        }
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "%s%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }

    $ awk -f tst.awk example.txt example.txt | column -t -s';' -o'; '
    ALBERT; BRYAN ; CLAUDIA; DAVID ; ERIK
    some a; some b; some c ; some d; some e
    some a; some b;        ;       ; some e
    some a; some b;        ; some d;