Source Link
Ed Morton
  • 35.9k
  • 6
  • 25
  • 60

Assuming that some tags (names like "ALBERT") could be missing from the first line just like they can be missing from other lines, you need a 2-pass approach to first identify all of the tags and then print the values for all of them for every line whether they are present on that line or not.



    $ cat tst.awk
    BEGIN { OFS=";" }
    NR==FNR {
        for (i=1; i<NF; i+=3 ) {
            if ( !seen[$i]++ ) {
                tags[++numTags] = $i
            }
        }
        next
    }
    {
        delete tag2val
        for (i=1; i<NF; i+=3) {
            tag = $i
            val = $(i+1) FS $(i+2)
            tag2val[tag] = val
        }
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "%s%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }

    $ awk -f tst.awk example.txt example.txt | column -t -s';' -o'; '
    some a; some b; some c; some d; some e
    some a; some b;       ;       ; some e
    some a; some b;       ; some d;

The above will output for each line the values for all of the tags (names, e.g. "ALBERT") in the order they appeared across all of the input.
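To make "the order they appeared across all of the input" concrete, here is a minimal sketch of the tag-collection step on its own. The two input lines are made up for illustration (they are not the question's data) and assume the same layout as above: each tag is followed by a two-word value. The first line deliberately lacks ALBERT, so that tag is only discovered on the second line:

```shell
# Hypothetical input: a tag followed by a two-word value, repeated.
# ALBERT is missing from the first line on purpose.
tag_order=$(
    printf '%s\n' \
        'BRYAN some b ERIK some e' \
        'ALBERT some a BRYAN some b' |
    awk '
        {
            # Every 3rd field, starting at field 1, is a tag.
            for (i=1; i<NF; i+=3) {
                if ( !seen[$i]++ ) {        # true only the first time a tag is seen
                    tags[++numTags] = $i
                }
            }
        }
        END {
            # Print the tags in the order they were first encountered.
            for (tagNr=1; tagNr<=numTags; tagNr++) {
                printf "%s%s", tags[tagNr], (tagNr<numTags ? ";" : "\n")
            }
        }
    '
)
echo "$tag_order"    # BRYAN;ERIK;ALBERT
```

ALBERT ends up last because it first shows up on the second line. This is also why the file name is given twice on the command line above: the complete tag list is only known once the whole input has been read, so the values can only be printed on a second read.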

If you want to see the tags as column headers:

    $ cat tst.awk
    BEGIN { OFS=";" }
    NR==FNR {
        for (i=1; i<NF; i+=3 ) {
            if ( !seen[$i]++ ) {
                tags[++numTags] = $i
            }
        }
        next
    }
    FNR==1 {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            printf "%s%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }
    {
        delete tag2val
        for (i=1; i<NF; i+=3) {
            tag = $i
            val = $(i+1) FS $(i+2)
            tag2val[tag] = val
        }
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "%s%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }

    $ awk -f tst.awk example.txt example.txt | column -t -s';' -o'; '
    ALBERT; BRYAN ; CLAUDIA; DAVID ; ERIK
    some a; some b; some c ; some d; some e
    some a; some b;        ;       ; some e
    some a; some b;        ; some d;