Revisions to Grep multiple patterns and print a different number of lines below each of the patterns?

handle IFS=[-ne] properly.

edited Jul 8, 2015 at 2:51


    amatch(){ sed ${2:+-ne"$(n=0; \
            while [ "$#" -gt "$((!!(n+=1)))" ]; \
            do printf "\n/%s/{:$n\n\t%s;to\n\t%s\n}" \
                      "$1" "\$p;N;s/\n/&/$2" "b$n"; \
                      shift 2; \
            done; printf "\nd;:o\n\t%s\n\t%s\n\tH;x;p" \
                  'h;y/\n-/-\n/' 's/[^-]*//g' \
    )"}; }

    amatch(){ sed ${2:+"-ne$(n=0; \
            while [ "$#" -gt "$((!!(n+=1)))" ]; \
            do printf "\n/%s/{:$n\n\t%s;to\n\t%s\n}" \
                      "$1" "\$p;N;s/\n/&/$2" "b$n"; \
                      shift 2; \
            done; printf "\nd;:o\n\t%s\n\t%s\n\tH;x;p" \
                  'h;y/\n-/-\n/' 's/[^-]*//g' \
    )"}; }
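The revision note only says the change handles IFS=[-ne]. Presumably the point is that moving the opening quote in front of -ne protects it from field splitting. The following is a minimal sketch of that effect, not code from the answer: show_args, unquoted_form and quoted_form are hypothetical names, and the literal word script stands in for the generated sed script. The exact fields the unquoted form yields may differ between shells.

```shell
# show_args is a hypothetical helper: it prints each argument in brackets.
show_args() { printf '[%s]' "$@"; printf '\n'; }

# Old form: -ne sits outside the quotes, so it is subject to field splitting.
unquoted_form() (
    IFS=-ne                     # every character of "-ne" is now a separator
    set -- pattern 3
    show_args ${2:+-ne"script"}
)

# New form: the whole word is quoted, so IFS cannot touch it.
quoted_form() (
    IFS=-ne
    set -- pattern 3
    show_args ${2:+"-nescript"}
)

unquoted_form   # the -ne option is split apart and lost
quoted_form     # sed would receive the single argument -nescript
```

Each function body runs in a subshell (the parentheses), so the odd IFS value does not leak into the calling shell.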

added 1988 characters in body

edited Jul 8, 2015 at 0:13

mikeserv


    sed -e:1 -e'/pattern1/{$p;N;s/\n/&/[num];to' -eb1 -e\} \
        -e:2 -e'/pattern2/{$p;N;s/\n/&/[num];to' -eb2 -e\} \
        -ed -e:o -e'h;y/\n-/-\n/' -e's/[^-]*//g;H;x'

    sed -ne'/pattern1/{:1' -e'$p;N;s/\n/&/[num]; to' -eb1 -e\} \
        -e'/pattern2/{:2' -e'$p;N;s/\n/&/[num]; to' -eb2 -e\} \
        -ed -e:o -eh -e'y/\n-/-\n/;s/[^-]*//g' -e'H;x;p'

That will print as many dashes after a match block as there are newlines following the match. So if you wanted 4 lines after every pattern1 match and 2 after every pattern2 match, each pattern1 block would be followed by a divider line of 4 hyphens and each pattern2 block by a divider line of 2 hyphens. The one exception is when the last line of input is reached while the tail is still being gathered: in that case everything from the pattern match through the last line is printed, but no hyphens are appended.
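To make that concrete, here is a small sketch with one pattern filled in. The function names, the pattern ^b$ and the count 2 are made up for illustration; the sed commands themselves are the ones shown above.

```shell
# Hypothetical single-pattern instance: pattern1 is ^b$, [num] is 2.
match_with_tail() {
    sed -n -e '/^b$/{:1' -e '$p;N;s/\n/&/2;to' -e b1 -e '}' \
        -e d -e :o -e h -e 'y/\n-/-\n/;s/[^-]*//g' -e 'H;x;p'
}

# The divider trick on its own, applied to a two-line buffer:
# swap newlines and dashes, delete everything that is not a dash,
# then append what is left to the saved copy of the block.
dash_per_newline() {
    sed -n 'N;h;y/\n-/-\n/;s/[^-]*//g;H;x;p'
}

# Enough input after the match: b, c, d, then a line of two dashes.
printf 'a\nb\nc\nd\ne\n' | match_with_tail

# Input ends while the tail is being gathered: b and c, no dashes.
printf 'a\nb\nc\n' | match_with_tail

# A dash already in the data is not counted: x-y, z, then one dash.
printf 'x-y\nz\n' | dash_per_newline
```

The y command is what keeps dashes in the data from inflating the divider: they are turned into newlines first and then deleted along with everything else.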

You might notice that much of the code in the sed script above is redundant: it implements the same kind of loop for each possible match. sed's very simple syntax rules, as evinced above, are what make sed so eminently scriptable. In other words, because sed's syntax is basic, it is a simple affair to write a script that writes a sed script.

For example, this task might easily be parameterized to work with any number of patterns and associated follow line counts:

    amatch(){ sed ${2:+-ne"$(n=0; \
            while [ "$#" -gt "$((!!(n+=1)))" ]; \
            do printf "\n/%s/{:$n\n\t%s;to\n\t%s\n}" \
                      "$1" "\$p;N;s/\n/&/$2" "b$n"; \
                      shift 2; \
            done; printf "\nd;:o\n\t%s\n\t%s\n\tH;x;p" \
                  'h;y/\n-/-\n/' 's/[^-]*//g' \
    )"}; }

So long as amatch() is called with 2 or more parameters, it will build out a sed script just like the one above, and write a loop for each pair.

So it first builds and prints a sed script in a subshell, and afterward sed runs it against stdin.
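One way to see the two stages separately is to pull the script-building half into its own function and look at what it prints before sed ever runs. This is only a sketch: amatch_script is a hypothetical name, and the loop is written in a plainer style than the one-liner above, though it is meant to emit the same commands.

```shell
# amatch_script is a hypothetical helper (not part of the answer above):
# it prints the sed script for its pattern/count pairs instead of running it.
# The body is a subshell, so n does not leak into the caller.
amatch_script() (
    n=0
    while [ "$#" -gt 1 ]; do
        n=$((n + 1))
        # One gather loop per pair: label $n, pattern $1, line count $2.
        printf '/%s/{:%s\n\t%s;to\n\tb%s\n}\n' \
            "$1" "$n" "\$p;N;s/\n/&/$2" "$n"
        shift 2
    done
    # Shared tail: one dash per newline in the block, then print.
    printf 'd;:o\n\t%s\n\t%s\n\tH;x;p\n' 'h;y/\n-/-\n/' 's/[^-]*//g'
)

# Stage 1: inspect the generated script.
amatch_script '[45]' 5 '1$' 2

# Stage 2: hand the same script to sed.
seq 30 | sed -n "$(amatch_script '[45]' 5 '1$' 2)"
```

Printing the script first is also a convenient way to debug a pattern that needs extra escaping before it reaches sed.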

So when I do:

seq 30 | amatch \[45] 5 1$ 2

The shell's while loop assembles and prints out to the command substitution a script that looks like:

    /[45]/{:1
            $p;N;s/\n/&/5;to
            b1
    }
    /1$/{:2
            $p;N;s/\n/&/2;to
            b2
    }
    d;:o
            h;y/\n-/-\n/
            s/[^-]*//g
            H;x;p

And sed evaluates that against stdin and prints...

    1
    2
    3
    --
    4
    5
    6
    7
    8
    9
    -----
    11
    12
    13
    --
    14
    15
    16
    17
    18
    19
    -----
    21
    22
    23
    --
    24
    25
    26
    27
    28
    29
    -----

handle a pattern2 occurrence during a pattern1 buffer gathering even if pattern2 wants fewer newlines than pattern1

edited Jul 7, 2015 at 20:41

mikeserv


    sed -e:n -e'/pattern1/{$p;N;s/\n/&/[num];to' -e\} \
        -e'/pattern2/{$p;N;s/\n/&/[num];to' -e\} \
        -e'/\n/!d;bn' -e:o -e'h;y/\n-/-\n/' \
        -e's/[^-]*//g;H;x'

    sed -e:1 -e'/pattern1/{$p;N;s/\n/&/[num];to' -eb1 -e\} \
        -e:2 -e'/pattern2/{$p;N;s/\n/&/[num];to' -eb2 -e\} \
        -ed -e:o -e'h;y/\n-/-\n/' -e's/[^-]*//g;H;x'

That will print as many dashes after a match block as there are newlines following the match. So if you wanted 4 lines after every pattern1 match and 2 after every pattern2 match, each pattern1 block would be followed by a divider line of 4 hyphens and each pattern2 block by a divider line of 2 hyphens. The one exception is when the last line of input is reached while the tail is still being gathered: in that case everything from the pattern match through the last line is printed, but no hyphens are appended.