added 456 characters in body
Source Link
mikeserv
  • 59.4k
  • 10
  • 122
  • 242

And actually, I guess, there are three ways. The third might look like:

sed 'H;$!d;x;s/\(\n\*\** *\n\(\([0-9./: ]*\n\)*\)\)*./\2/g' 

...which reads the whole file into pattern space and then globally substitutes away every character that falls outside the specification for the matched lines. It prints the same as before, but expressions like that are a pain to write, and they're only safe performance-wise when you balance the optional groups against an any-character match (the trailing . here).
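To see just the read-in-the-whole-file part, here is a minimal sketch of the H;$!d;x idiom on its own. The function name join_lines and the sample input are made up for illustration; the substitutions are deliberately trivial stand-ins for the real one above.

```shell
# join_lines is a hypothetical helper name used only for this illustration.
join_lines() {
    # H    append each input line to hold space (with a leading newline)
    # $!d  on every line but the last, delete and start the next cycle
    # x    on the last line, swap the accumulated text into pattern space
    # The two s commands then operate on the whole input at once:
    # drop the leading newline, then turn the remaining newlines into commas.
    sed 'H;$!d;x;s/^\n//;s/\n/,/g'
}

printf '%s\n' a b c | join_lines    # prints: a,b,c
```

Nothing is printed until the last line arrives, which is why the whole file ends up in memory with this approach.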

{ grep -xm1 '*\** *' >&2; sed -n '/^\( *<np>.*\)*$/q;p'; } <infile 2>/dev/null >outfile
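
The sed half of that pair can be tried on its own. This is only a sketch: the function name first_block and the sample lines are invented for the demo, and it assumes the input already starts just after the line of asterisks.

```shell
# first_block is a hypothetical name; the sample input below is invented.
first_block() {
    # With -n nothing is printed automatically. p prints each line, but q
    # quits first (printing nothing) on the first line that is either blank
    # or starts with optional spaces followed by <np>.
    sed -n '/^\( *<np>.*\)*$/q;p'
}

printf '%s\n' '12.5 3/4' '10:30' '  <np> next' 'ignored' | first_block
# prints the first two lines only
```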
  • x
    • Swaps hold and pattern spaces. This institutes a look-behind: sed is always one line behind input, and the first line it sees is always blank.
  • /^\( *<np>.*\)*$/
    • This selects a stop-printing-before-here line: one that matches, from head to tail, zero or more occurrences of the match group. Two kinds of lines can do that: either a blank line, or one with any number of <spaces> at the head of the line followed by the string <np>.
  • /^*\** *$/
    • This selects a start-printing-after-here line: one which opens with at least one * asterisk character and continues to the end of the line with only zero or more further * asterisks, possibly closed by any number of spaces.
  • c\' -e ''
    • This changes the entire blocked selection to a single blank line, squeezing all unwanted lines to the string EOF.
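
The one-line-behind effect of x is easy to see in isolation. A minimal sketch, with lag_one as a made-up name:

```shell
# lag_one is a hypothetical name for this illustration.
lag_one() {
    # x swaps pattern and hold space on every line, so each cycle prints the
    # previous line. Hold space starts out empty, so the first line printed
    # is blank, and the last input line is left behind in hold space.
    sed x
}

printf '%s\n' a b c | lag_one
# prints a blank line, then a, then b
```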

So any number of lines occurring before ^*\** *$ and after the first following ^\( *<np>.*\)*$ are always squeezed down to only a single blank, and only the first occurring paragraph after a match for ^*\** *$ is printed to stdout. It prints...

{ grep -xm1 '*\** *' >&2; sed -n '/^\( *<np>.*\)*$/q;p'; } <infile 2>/dev/null >outfile
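
If you want to check which lines a start pattern of the form *\** * accepts, something like the following works. It is a sketch only: is_start_line and the test strings are made up, and some newer grep builds may warn on stderr about the leading * while still matching it literally.

```shell
# is_start_line is a hypothetical helper; it succeeds when its argument is
# one or more asterisks optionally followed by spaces, and nothing else.
is_start_line() {
    # In a basic regular expression a leading * is literal, \** is zero or
    # more further asterisks, and ' *' allows trailing spaces. -x anchors
    # the match to the whole line and -q suppresses output.
    printf '%s\n' "$1" | grep -xq '*\** *'
}

for line in '*' '*****  ' ' *' 'a*' ''; do
    if is_start_line "$line"; then
        echo "start: [$line]"
    else
        echo "skip:  [$line]"
    fi
done
```

Only the first two strings are reported as start lines; a leading space, a leading letter, or an empty line all fail the whole-line match.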