AWK: how to select lines by the number of words in one field?

Question

Here's a text file I have:

1|this|1000
2|that|2000
3|hello|3000
4|hello world|4000
5|lucky you|5000
6|awk is awesome|6000
. . .

How do I print only the lines that have exactly two words in $2 (lines 4 and 5)?

This is what I have tried but it counts the number of letters instead of words:

awk -F"|" '{if(length($2)==2) print $0}'

steeldriver · Accepted Answer · 2017-10-31 00:13:48Z

You can use the return value of the awk split function:

$ awk -F'|' 'split($2,a,"[ \t]+") == 2' file
4|hello world|4000
5|lucky you|5000
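As a self-contained sketch of the same idea (sample.txt is a hypothetical file name, and the data is copied from the question): split returns how many pieces it stored in the array, so comparing that count with 2 keeps only the two-word fields.

```shell
# Recreate the sample data from the question (sample.txt is a hypothetical name).
printf '%s\n' '1|this|1000' '2|that|2000' '3|hello|3000' \
  '4|hello world|4000' '5|lucky you|5000' '6|awk is awesome|6000' > sample.txt

# split() returns the number of elements stored in array a;
# a pattern with no action prints the line when the count is exactly 2.
result=$(awk -F'|' 'split($2, a, "[ \t]+") == 2' sample.txt)
printf '%s\n' "$result"
```

If $2 might carry leading or trailing blanks, the regex separator produces empty pieces and inflates the count. Passing a single space as the separator, split($2, a, " "), should instead use awk's default whitespace splitting, which ignores them.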

αғsнιη · Accepted Answer · 2017-10-31 09:23:14Z

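You could also use the return value of the gsub function instead. gsub returns the number of substitutions it made, so exactly one run of whitespace in $2 means two words. The line is saved in l first because gsub modifies $2, which makes awk rebuild $0.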

awk -F'|' '{l=$0} gsub(/[ \t]+/,"",$2)==1{print l}'
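A possible variation, shown here only as a sketch: run gsub on a copy of the field, so $0 is never rebuilt and does not need to be saved. The variable name f and the file name sample.txt are made up for illustration.

```shell
# A few lines from the question's data (sample.txt is a hypothetical name).
printf '%s\n' '3|hello|3000' '4|hello world|4000' \
  '5|lucky you|5000' '6|awk is awesome|6000' > sample.txt

# Copy $2 into f, then count the runs of blanks removed from the copy.
# One run of blanks means two words; $0 is untouched, so plain print works.
result=$(awk -F'|' '{ f = $2; if (gsub(/[ \t]+/, "", f) == 1) print }' sample.txt)
printf '%s\n' "$result"
```

Like the original command, this assumes $2 has no leading or trailing blanks; a field such as " hello" would also yield a count of 1.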

MiniMax · Accepted Answer · 2017-10-31 15:06:40Z

awk '/^.+\|\w+ \w+\|/' input.txt

Explanation:

'/^.+\|\w+ \w+\|/' - every line matching this pattern is printed.
^ - anchors the match at the beginning of the line.
.+ - one or more of any character.
\| - a literal pipe character. It must be escaped with a backslash; otherwise it is treated as the alternation ('or') operator.
\w+ \w+ - word characters, then a space, then word characters; in other words, word space word, which is exactly what you need.
\| - the second pipe character.

Input

1|this|1000
2|that|2000
3|hello|3000
4|hello world|4000
5|lucky you|5000
6|awk is awesome|6000

Output

4|hello world|4000
5|lucky you|5000
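Two caveats, with a hedged alternative: \w is a GNU awk extension rather than POSIX awk syntax, and because .+ can itself match pipe characters, the whole-line pattern can also match a two-word field that is not the second one. A sketch that tests only $2 with a bracket expression is below; the file name sample.txt and the extra line 7 are made up for illustration.

```shell
# Sample data from the question plus one hypothetical extra line (7)
# whose two-word field is the third field, not the second.
printf '%s\n' '1|this|1000' '2|that|2000' '3|hello|3000' \
  '4|hello world|4000' '5|lucky you|5000' '6|awk is awesome|6000' \
  '7|one|two words|7000' > sample.txt

# Match only when $2 as a whole is: non-blanks, a run of blanks, non-blanks.
result=$(awk -F'|' '$2 ~ /^[^ \t]+[ \t]+[^ \t]+$/' sample.txt)
printf '%s\n' "$result"
```

Anchoring with ^ and $ inside the field is what rejects "awk is awesome", and testing $2 rather than the whole line is what rejects line 7.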
