5

Here's a text file I have:

1|this|1000
2|that|2000
3|hello|3000
4|hello world|4000
5|lucky you|5000
6|awk is awesome|6000
. . .

How do I print only the lines that have exactly two words in field $2 (lines 4 and 5)?

This is what I have tried, but it counts the number of letters instead of words:

awk -F"|" '{if(length($2)==2) print $0}'

3 Answers

16

You can use the return value of the awk split function:

$ awk -F'|' 'split($2,a,"[ \t]+") == 2' file
4|hello world|4000
5|lucky you|5000
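One caveat with the split approach: with the separator "[ \t]+", a field that starts or ends with a blank yields an extra empty piece, so the count is off by one. A minimal sketch of a variant, assuming you want padded fields to count as well: passing a single space as the separator makes split behave like awk's default field splitting, which ignores leading and trailing blanks. The padded record 7 and the array name `words` are hypothetical, added only for illustration.

```shell
# Sample records are piped in; record 7 (padded field) is made up for this sketch.
result=$(printf '%s\n' \
  '1|this|1000' \
  '4|hello world|4000' \
  '6|awk is awesome|6000' \
  '7| padded field |7000' |
  # split() returns the number of pieces; the " " separator
  # skips leading and trailing blanks in $2.
  awk -F'|' 'split($2, words, " ") == 2')

printf '%s\n' "$result"
```

With this input, records 4 and 7 are printed; the original "[ \t]+" separator would skip record 7 because of the surrounding blanks.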
1

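You could also use the return value of the gsub function instead. gsub returns the number of substitutions it made, so exactly one run of whitespace in $2 means exactly two words (assuming the field has no leading or trailing whitespace). The line is saved in l first because modifying $2 rebuilds $0.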

awk -F'|' '{l=$0} gsub(/[ \t]+/,"",$2)==1{print l}' 
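As a possible alternative to saving the whole line, gsub can be run on a copy of the field, which leaves $0 untouched so the default print action still outputs the original record. This is only a sketch; the variable name `field` is hypothetical.

```shell
result=$(printf '%s\n' \
  '3|hello|3000' \
  '4|hello world|4000' \
  '6|awk is awesome|6000' |
  # Copy $2 first, then count whitespace runs in the copy.
  # gsub() returns the number of replacements: 1 run means 2 words.
  awk -F'|' '{ field = $2 } gsub(/[ \t]+/, "", field) == 1')

printf '%s\n' "$result"
```

Here only record 4 is printed: record 3 has no whitespace run in $2 and record 6 has two.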
1
awk '/^.+\|\w+ \w+\|/' input.txt 

Explanation:

  • '/^.+\|\w+ \w+\|/' - every line matching this pattern is printed.
  • ^ - anchors the match at the beginning of the line.
  • .+ - one or more of any character.
  • \| - a literal pipe character. It must be escaped with a backslash; otherwise it is treated as the alternation ("or") operator.
  • \w+ \w+ - word characters, then a space, then word characters; in other words, "word space word", which is exactly what you need.
  • \| - the second pipe character.

Input

1|this|1000
2|that|2000
3|hello|3000
4|hello world|4000
5|lucky you|5000
6|awk is awesome|6000

Output

4|hello world|4000
5|lucky you|5000
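Note that \w is a GNU awk extension, so the pattern above may not work in other awk implementations. A rough portable sketch, assuming ASCII-only words, spells the class out as [A-Za-z0-9_] and uses [^|]+ for the first field so the match is tied to the second field:

```shell
result=$(printf '%s\n' \
  '1|this|1000' \
  '4|hello world|4000' \
  '5|lucky you|5000' \
  '6|awk is awesome|6000' |
  # [^|]+ matches the first field, \| a literal pipe, then
  # "word space word" followed by the second pipe.
  awk '/^[^|]+\|[A-Za-z0-9_]+ [A-Za-z0-9_]+\|/')

printf '%s\n' "$result"
```

With this input, records 4 and 5 are printed, matching the output shown above.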
