
I have an input file in this form:

foo bar 08 320984 2384
bla foo baz 23 32425 32532
[...]

There are always three tokens at the end of each line, but an unknown number of tokens in front of them. I want to rewrite the file as CSV so that other applications can parse it automatically. My current awk command, which prints only the last three fields, is:

awk '{ print $(NF-2)";"$(NF-1)";"$NF}' 

The output should be:

foo bar;08;320984;2384
bla foo baz;23;32425;32532
[...]
  • That's not CSV but that's not the point. You are trying to split your file into four output fields? Where the last three fields are the last three fields of the input and the first field is everything else on the line? Commented Nov 4, 2014 at 21:46
  • How do you know what a token is? Any text with no numbers? You have two words in one line and three words in another. Commented Nov 4, 2014 at 22:15
  • @Jotne from the sentence "there are always three tokens in the end, but an unknown number of tokens in the front" I understand it is always like tk1 tk2 tk3 ... tkn TK1 TK2 TK3 and has to become tk1 tk2 tk3 ... tkn;TK1;TK2;TK3. Commented Nov 5, 2014 at 9:00
  • @fedorqui So in short, add ; for the last three fields? Commented Nov 5, 2014 at 9:27
  • @Jotne in front of each one of the last three, exactly. Commented Nov 5, 2014 at 9:28

6 Answers


Unfortunately, this is something that awk just isn't great at (and cut's ability to do field ranges doesn't help here either).

Something like this should work though:

awk '{nfff=$(NF-2); nff=$(NF-1); nf=$NF; NF-=3; printf "%s;%s;%s;%s\n", $0, nfff, nff, nf}' file 

1 Comment

A small variation on your approach: $ awk '{ last=";"$(NF-2)";"$(NF-1)";"$NF; NF-=3; print $0 last}' file

If I understand you and fedorqui properly:

awk '{for (i=1;i<NF;i++) printf "%s%s",$i,(i+4>NF?";":FS);print $NF}' file
foo bar;08;320984;2384
bla foo baz;23;32425;32532

This will add ; in front of the last three fields.

John's comment may be a better way to do it.


sed could also work:

sed 's/\ \([^\ ]\+\)\ \([^\ ]\+\)\ \([^\ ]\+\)$/;\1;\2;\3/' file 

or if your sed supports -r:

sed -r 's/\ ([^\ ]+)\ ([^\ ]+)\ ([^\ ]+)$/;\1;\2;\3/' file 

It replaces the last three spaces with ;.
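The first command above uses \+, which is a GNU extension in basic regular expressions. As a minimal sketch for a sed without that extension, the same substitution can be spelled with [^ ][^ ]* instead. The function name last3_sed is hypothetical and only wraps the command so it can be reused:

```shell
# Hypothetical wrapper (the name is illustrative, not from the answer).
# Same substitution as above, written without \+ so it stays within
# POSIX basic regular expressions.
last3_sed() {
    sed 's/ \([^ ][^ ]*\) \([^ ][^ ]*\) \([^ ][^ ]*\)$/;\1;\2;\3/' "$@"
}

# Usage: reads the named files, or standard input when none are given.
printf '%s\n' 'foo bar 08 320984 2384' 'bla foo baz 23 32425 32532' | last3_sed
```

Like the original command, this assumes the last three tokens are separated by exactly one space each and that the line has no trailing blanks; lines that do not fit that pattern are passed through unchanged.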

Or a bit easier:

rev file | sed 's/\ /;/g; s/;/\ /g4' | rev 


A fancy GNU awk method:

gawk '
function replace(what) {
    return gensub(/[[:blank:]]+([^[:blank:]]+)$/, ";\\1", 1, what)
}
{$0 = replace(replace(replace($0))); print}
' file
foo bar;08;320984;2384
bla foo baz;23;32425;32532
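gensub is specific to gawk. As a rough sketch of the same idea for other awk implementations, the last blank-separated token can be located with match() and re-attached with a ; three times in a row. The function name last3_match and the variable names are hypothetical:

```shell
# Hypothetical wrapper (the name is illustrative). It mimics the gensub
# approach using only match(), substr() and sub().
last3_match() {
    awk '
    {
        line = $0
        for (k = 0; k < 3; k++) {
            # find the final run of blanks plus the token that follows it
            if (!match(line, /[ \t]+[^ \t]+$/))
                break
            tok = substr(line, RSTART, RLENGTH)
            sub(/^[ \t]+/, "", tok)          # strip the blanks, keep the token
            line = substr(line, 1, RSTART - 1) ";" tok
        }
        print line
    }' "$@"
}

printf '%s\n' 'foo bar 08 320984 2384' 'bla foo baz 23 32425 32532' | last3_match
```

Because only the matched tail of the line is rewritten, any spacing between the leading tokens is left as it was. A line with fewer than four tokens simply gets fewer separators, which matches what the gensub version does.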


This should do it for an arbitrary number of fields before the last three:

awk '{for (i=1; i <= NF - 3; i++) if (i == 1) printf "%s", $i; else printf " %s", $i} {print ";"$(NF-2)";"$(NF-1)";"$NF}' input 


I am new to awk, but how about this (it does not remove the blank spaces, though):

awk '{for (i=0; i<3; i++) {$(NF-i)=";" $(NF-i)} print $0} ' file 

Example:

sdlcb@Goofy-Gen:~/AMD$ cat file
foo bar 08 320984 2384
bla foo baz 23 32425 32532
sdlcb@Goofy-Gen:~/AMD$ awk '{for (i=0; i<3; i++) {$(NF-i)=";" $(NF-i)} print $0} ' file
foo bar ;08 ;320984 ;2384
bla foo baz ;23 ;32425 ;32532
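The leftover blanks come from awk rebuilding the line with its output field separator (a space) after the fields are assigned. One possible way to remove them, sketched under the assumption that none of the leading tokens already starts with ;, is a gsub after the loop. The function name prefix_last3 and the guard for short lines are additions that are not part of the answer:

```shell
# Hypothetical wrapper (the name is illustrative). The NF guard and the
# gsub step are additions to the command shown in the answer.
prefix_last3() {
    awk '
    NF < 4 { print; next }                 # assumption: leave short lines untouched
    {
        for (i = 0; i < 3; i++)
            $(NF - i) = ";" $(NF - i)      # assigning a field rebuilds $0 with OFS
        gsub(/ ;/, ";")                    # drop the space now preceding each ";"
        print
    }' "$@"
}

printf '%s\n' 'foo bar 08 320984 2384' 'bla foo baz 23 32425 32532' | prefix_last3
```

Note that rebuilding the line also collapses runs of blanks between the leading tokens into single spaces, so this variant suits input where that spacing does not matter.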
