grep uses POSIX Basic Regular Expressions (BRE) by default.
.* in BRE is always greedy, meaning it matches as much as possible: everything up to the last " on the line.
You can use [^"]* instead to match anything except ".
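To see the difference on a single line, here is a minimal sketch. The sample line is made up for illustration and is not taken from your file; it only assumes that a line can hold more than one quoted attribute.

```shell
# Hypothetical sample line with two quoted attributes (not from the original file)
line='<Match Competition="FA Cup" Venue="Wembley">'

# Greedy: .* runs on to the last double quote on the line
greedy=$(printf '%s\n' "$line" | grep -o 'Competition=".*"')

# Negated class: [^"]* cannot cross a quote, so it stops at the first closing one
bounded=$(printf '%s\n' "$line" | grep -o 'Competition="[^"]*"')

printf '%s\n' "$greedy" "$bounded"
```

The first pattern should print the Competition attribute together with everything up to the final quote, while the second should print the Competition attribute alone. Applied to the file from the question: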
    grep -o 'Competition="[^"]*"' 'Soccer_Data.xml' | sort --unique
Output:
    Competition="FA Cup"
Alternatively, use Perl-compatible regular expressions (PCRE), which provide a non-greedy quantifier (.*?).
You can use grep -P if your version of grep provides it (and it will, as you have added the [ubuntu] tag to your question).
    grep -Po 'Competition=".*?"' 'Soccer_Data.xml' | sort --unique
Or, to get only FA Cup, use \K, which keeps everything matched before it out of the output:
    grep -Po 'Competition="\K[^"]*' 'Soccer_Data.xml' | sort --unique
Output:
    FA Cup
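If you want to try both PCRE variants without touching your real data, a small sketch along these lines may help. The temporary file and its three lines are hypothetical stand-ins for Soccer_Data.xml, and it assumes GNU grep built with -P support.

```shell
# Hypothetical sample data standing in for Soccer_Data.xml
tmp=$(mktemp)
printf '%s\n' \
  '<Match Competition="FA Cup" Venue="Wembley">' \
  '<Match Competition="League Cup" Venue="Anfield">' \
  '<Match Competition="FA Cup" Venue="Old Trafford">' > "$tmp"

# Lazy quantifier: .*? stops at the first closing quote (requires grep -P)
lazy=$(grep -Po 'Competition=".*?"' "$tmp" | sort --unique)

# \K drops the 'Competition="' prefix from the reported match
values=$(grep -Po 'Competition="\K[^"]*' "$tmp" | sort --unique)

printf '%s\n' "$lazy" "$values"
rm -f "$tmp"
```

With more than one competition in the data, sort --unique collapses the repeated FA Cup entries, so each distinct value should appear once.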