it's not perl, it's only perl compatible
choroba

grep uses POSIX basic regular expressions (BRE) by default.

.* in basic regex is always greedy: it matches as much as possible, so it runs up to the last " in the line rather than the first.

You can use [^"]* instead, which matches any run of characters other than " and so stops at the first closing quote.

grep -o 'Competition="[^"]*"' 'Soccer_Data.xml' | sort --unique 

Output:

Competition="FA Cup" 
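To see the difference between the two patterns, here is a minimal sketch run against a single made-up line. The sample line and its Venue attribute are assumptions for illustration only; the real contents of Soccer_Data.xml are not shown in the question.

```shell
# Hypothetical sample line, not taken from the real Soccer_Data.xml.
line='<Match Competition="FA Cup" Venue="Wembley"/>'

# Greedy .* runs on to the last double quote in the line.
greedy=$(printf '%s\n' "$line" | grep -o 'Competition=".*"')

# The negated bracket expression stops at the first closing quote.
bounded=$(printf '%s\n' "$line" | grep -o 'Competition="[^"]*"')

printf '%s\n' "$greedy"
printf '%s\n' "$bounded"
```

The first command should also pick up the Venue attribute, while the second prints only the Competition attribute.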

Alternatively, use Perl-compatible regular expressions (PCRE), which provide the non-greedy quantifier (.*?).
This requires grep -P, which is available only if your version of grep supports it (and it will, as you have added the [ubuntu] tag to your question).

grep -Po 'Competition=".*?"' 'Soccer_Data.xml' | sort --unique 

Or, to print only the value (FA Cup), use \K ("keep out"), which drops everything matched before it from the output:

grep -Po 'Competition="\K[^"]*' 'Soccer_Data.xml' | sort --unique 

Output:

FA Cup 
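If your grep lacks -P, a sed capture group can extract the same value. This is only a sketch of a different technique, not part of the grep approach above, and the sample lines are invented for illustration.

```shell
# Hypothetical sample input standing in for Soccer_Data.xml.
sample='<Match Competition="FA Cup" Venue="Wembley"/>
<Match Competition="FA Cup" Venue="Anfield"/>'

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded, replaced by the captured attribute value.
values=$(printf '%s\n' "$sample" | sed -n 's/.*Competition="\([^"]*\)".*/\1/p' | sort -u)

printf '%s\n' "$values"
```

Because the leading .* is greedy, this picks the last Competition attribute if a line contains more than one.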
pLumo