3

I'm writing a parser for build output, and I'd like to highlight different patterns in different colors. So for example, I'd like to do:

sed -e "s|\(Error=errcode1\)|<red>\1<_red>|" \
    -e "s|\(Error=errcode2\)|<orange>\1<_orange>|" \
    -e "s|\(Error=.*\)|<blue>\1<_blue>|"

(so it highlights errcode1 in red, errcode2 in orange, and anything else in blue). The problem with this is that Error=errcode1 matches both the first and the third expression, which will result in <red><blue>Error=errcode1<_red><_blue>... Is there any way to tell sed to match only the first expression and, if it matches, not try the following expressions?

Note, the sed command will actually be auto-generated from files which will be very volatile, so I'd like a generic solution where I don't have to police whether patterns conflict...

1
  • 1
    Does it have to be with sed? Would you be interested in a solution using Perl? Commented Mar 16, 2018 at 20:47

4 Answers

4

Let's start with a simpler example to illustrate the problem. In the code below, both substitutions are performed:

$ echo 'error' | sed 's/error/error2/; s/error/error3/'
error32

If we want to skip the second if the first succeeded, we can use the "test" command, t, which branches if a substitution has succeeded on the current line since the last t. If we provide no label after t, it branches to the end of the script, skipping all remaining commands:

$ echo 'error' | sed 's/error/error2/; t; s/error/error3/'
error2

Summary

If you want to stop after the first substitution that succeeds, place a t command after each substitution command.
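Applied to the command from the question, that could look like the sketch below. The function name highlight and the three sample input lines are made up for illustration. Each -e t sits right after the substitution it guards, so the first rule that matches ends processing for that line.

```shell
# Hypothetical wrapper around the question's three rules.
# Each bare t jumps to the end of the script once the
# substitution before it has succeeded on the current line.
highlight() {
    sed -e 's|\(Error=errcode1\)|<red>\1<_red>|' -e t \
        -e 's|\(Error=errcode2\)|<orange>\1<_orange>|' -e t \
        -e 's|\(Error=.*\)|<blue>\1<_blue>|'
}

# Sample input (invented): one line per rule.
printf '%s\n' 'Error=errcode1' 'Error=errcode2' 'Error=other' | highlight
```

With this, Error=errcode1 is wrapped only in the red tags, because the blue rule is never reached for that line.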

More complex case

Suppose that we want to skip the second but not the third substitution if the first succeeds. In that case, we need to supply a label to the t command:

$ echo 'error' | sed 's/error/error2/; ta; s/error/error3/; :a; s/error/error4/'
error42

In the above, :a defines label a. The command ta branches to label a if the preceding s command succeeds.

Compatibility

The above code was tested in GNU sed. I am told that BSD sed does not accept ; as a command separator after a label. Thus, on BSD/macOS, try:

echo 'error' | sed -e 's/error/error2/' -e ta -e 's/error/error3/' -e :a -e 's/error/error4/' 
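Since the question says the command is auto-generated, another way to sidestep the separator issue is to emit every command on its own line, because a newline is a portable command separator in sed. The following is only a sketch under assumptions: the "color pattern" rule format, the rules variable, and the gen_script function are invented here, and patterns are assumed not to contain the | delimiter.

```shell
# Hypothetical rule list: "<color> <pattern>" per line, most specific first.
rules='red Error=errcode1
orange Error=errcode2
blue Error=.*'

# Emit one s command per rule, each followed by a bare t on its own line,
# so the first rule that matches ends processing for that input line.
gen_script() {
    while read -r color pattern; do
        printf 's|\\(%s\\)|<%s>\\1<_%s>|\n' "$pattern" "$color" "$color"
        printf 't\n'
    done
}

script=$(printf '%s\n' "$rules" | gen_script)

# Sample input (invented) run through the generated script.
printf '%s\n' 'Error=errcode2' 'Error=other' | sed -e "$script"
```

Because the rules are tried in file order and each one is followed by t, overlapping patterns do not need to be policed; only the order matters.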

0

You can apply boolean logic to matches with |, & and !.

Solution

(not sure if the syntax is compatible with your system so you may need to add more backslashes)

"s|\(Error=\(.*&!errcode1&!errcode2\)\)|<blue>\1<_blue>|" 

Other notes

sed can use any character as a delimiter, so all of the following expressions are equivalent:

"s/foo/bar/"
"s:foo:bar:"
"s|foo|bar|"
"s#foo#bar#"

Also, if you are running this from a bash script on a Unix-like system, you can use shell variables: your patterns are enclosed in double quotes rather than single quotes, so the shell expands variables inside them.

PREFIX="Error="
TARGET_1="errorcode1"
TARGET_2="errorcode2"
SUB_1="<red>\1<_red>"
SUB_2="<orange>\1<_orange>"
SUB_3="<blue>\1<_blue>"

sed -e "s|\($PREFIX$TARGET_1\)|$SUB_1|" \
    -e "s|\($PREFIX$TARGET_2\)|$SUB_2|" \
    -e "s|\($PREFIX\(.*&!$TARGET_1&!$TARGET_2\)\)|$SUB_3|"
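To make the quoting difference concrete, here is a small sketch using a single, simpler rule. The sample input is invented. With double quotes the shell substitutes the variable before sed sees the script. With single quotes sed receives the text $PREFIX literally.

```shell
PREFIX='Error='

# Double quotes: the shell expands ${PREFIX}, so sed sees Error=errcode1.
echo 'Error=errcode1' | sed "s|\(${PREFIX}errcode1\)|<red>\1<_red>|"

# Single quotes: sed gets the literal characters $PREFIX, which do not
# occur in the input, so the line passes through unchanged.
echo 'Error=errcode1' | sed 's|\($PREFIX\)|<red>\1<_red>|'
```

The first command prints the line wrapped in the red tags; the second prints it untouched.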


0

If the other error codes follow the naming scheme errcodeN, you can exclude 1 and 2 with a negated bracket expression:

sed -e "s|\(Error=errcode1\)|<red>\1<_red>|" \
    -e "s|\(Error=errcode2\)|<orange>\1<_orange>|" \
    -e "s|\(Error=errcode[^12]\)|<blue>\1<_blue>|"

If the codes exceed number 9: [^12]+

2 Comments

How is errorcode[^12] in your answer equivalent to * in the OP's question? You will miss all strings like Error=UserUnknown.
I don't have secret insights into the patterns of the other errorcodes. Maybe errorcode10?
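The gap raised in the first comment is easy to reproduce. The sketch below runs only the third rule from the answer against three invented sample lines. The helper name blue_rule is made up for illustration.

```shell
# Hypothetical helper containing only the answer's third rule.
blue_rule() {
    sed -e 's|\(Error=errcode[^12]\)|<blue>\1<_blue>|'
}

# Sample input (invented): a single-digit code, a code starting with 1,
# and an error that does not follow the errcodeN scheme.
printf '%s\n' 'Error=errcode3' 'Error=errcode10' 'Error=UserUnknown' | blue_rule
```

Only Error=errcode3 gets the blue tags. Error=errcode10 is skipped because the character after errcode is 1, and Error=UserUnknown never matches the errcode prefix.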
0

This is not a good application for sed; you should use awk instead. You didn't provide any sample input/output to test against, so this is untested, but you'd do something like this:

awk '
BEGIN {
    colors["errorcode1"] = "red"
    colors["errorcode2"] = "orange"
    colors["default"] = "blue"
}
match($0,/(.*Error=)([[:alnum:]]+)(.*)/,a) {
    code = a[2]
    color = (code in colors ? colors[code] : colors["default"])
    $0 = sprintf("%s<%s>%s<_%s>%s", a[1], color, code, color, a[3])
}
{ print }
'

The above relies on GNU awk for the third argument to match(); adapting it to other awks is a minor tweak.
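One possible shape of that tweak, as a sketch rather than a drop-in equivalent: the two-argument match() sets RSTART and RLENGTH, and substr() can then slice the line. The function name highlight_awk and the sample lines are invented, and the bracket expression [A-Za-z0-9] is swapped in for [[:alnum:]] on the assumption that some older awks lack POSIX character classes. Unlike the greedy (.*Error=) above, this version acts on the first Error= on a line rather than the last.

```shell
# Hypothetical portable variant using only two-argument match().
highlight_awk() {
    awk '
    BEGIN {
        colors["errorcode1"] = "red"
        colors["errorcode2"] = "orange"
        colors["default"]    = "blue"
    }
    match($0, /Error=[A-Za-z0-9]+/) {
        # "Error=" is 6 characters long; RSTART and RLENGTH cover
        # the prefix plus the code that follows it.
        head  = substr($0, 1, RSTART + 5)
        code  = substr($0, RSTART + 6, RLENGTH - 6)
        tail  = substr($0, RSTART + RLENGTH)
        color = (code in colors ? colors[code] : colors["default"])
        $0 = sprintf("%s<%s>%s<_%s>%s", head, color, code, color, tail)
    }
    { print }
    '
}

# Sample input (invented).
printf '%s\n' 'build: Error=errorcode1 at step 3' 'Error=other' 'no problems' |
    highlight_awk
```

As in the original, only the code after Error= is wrapped, and lines without Error= are printed unchanged.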
