Philippos

I suggest using sed to collect lines in the hold space and check whether they appeared before:

 sed -n 'H;G;/^\(C([^)]*)\).*\1 *\n/!P' 
  • H appends the current line to the hold space
  • G appends the hold space with all lines we ever saw to the pattern space
  • C([^)]*) matches one of those C(…) patterns. The ^ anchors it to the beginning of the line, and the surrounding \(…\) captures it so it can be backreferenced as \1 later. The pattern requires \1 *\n, that is, the repeated text followed by optional spaces and a newline, to avoid matching the freshly appended copy of the current line at the end of the pattern space (which has no trailing newline). So the whole pattern /^\(C([^)]*)\).*\1 *\n/ matches a line whose C(…) was already seen, and only if it does not match (the !) does sed run P
  • P prints everything before the first newline (= the current line without the appended hold space), while default output is suppressed by the -n option
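To see the steps above in action, here is a small sketch. The sample input is made up for illustration (one C(…) entry per line, some repeated), and it assumes a sed such as GNU sed, where . matches the embedded newlines in the pattern space and \n is accepted in the regex:

```shell
# Hypothetical sample input: one C(...) entry per line, two of them repeated.
input='C(a)
C(b)
C(a)
C(c)
C(b)'

# Same command as above: print a line only if its C(...) was not seen before.
result=$(printf '%s\n' "$input" | sed -n 'H;G;/^\(C([^)]*)\).*\1 *\n/!P')
printf '%s\n' "$result"
```

With this input only the first occurrence of each entry should survive, so the output is C(a), C(b) and C(c), one per line.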

Note that depending on your sed version and file size, this may fail, because over time every line of the input ends up in memory.
