sed - how to remove all lines that do not match

Question

I have an HTML file. I want to remove all lines that do not start with <tr>.

I tried:

cat my_file | sed $' s/^[^tr].*// ' | sed '/^$/d'

but it deleted all the lines.

s/^[^tr]... matches lines that start with any character other than t or r. Square brackets form a bracket expression (a character class) in a regex, not a literal string.
– Peter Cordes, Commented Aug 18, 2015 at 16:42
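
To make the comment above concrete, here is a small sketch. The sample lines and variable names are made up for illustration. Because < is neither t nor r, the original pattern blanks every line that starts with a tag, including the <tr> lines.

```shell
# Hypothetical sample input: two table rows and two other lines.
sample=$(printf '%s\n' '<html>' '<tr><td>1</td></tr>' 'text' '<tr><td>2</td></tr>')

# The asker's pattern: any line whose first character is not 't' or 'r'
# is blanked, and '<' is such a character. Only 'text' survives.
broken=$(printf '%s\n' "$sample" | sed 's/^[^tr].*//' | sed '/^$/d')

# Anchoring on the literal string <tr> keeps the intended lines.
fixed=$(printf '%s\n' "$sample" | sed '/^<tr>/!d')

printf 'broken:\n%s\n\nfixed:\n%s\n' "$broken" "$fixed"
```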

Cyrus · Accepted Answer · 2015-08-18 08:52:37Z

Try this with GNU sed:

sed -n '/^<tr>/p' file

or

sed '/^<tr>/!d' file

I find the version with !d particularly useful because it lets you add further sed commands to the same expression, and those commands then see only the kept lines. With -n and p, the match is merely printed, and the next command still receives every input line unchanged.
– jirislav, Commented Oct 12, 2019 at 19:52
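
A minimal sketch of the chaining described in the comment above. The sample input and the substitution are hypothetical; only the /^<tr>/!d part comes from the answer.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' '<table>' '<tr><td>a</td></tr>' '</table>')

# !d drops the non-matching lines first; the s command after the
# semicolon then runs only on the lines that remain.
chained=$(printf '%s\n' "$sample" | sed '/^<tr>/!d; s/<td>/<td class="x">/')

printf '%s\n' "$chained"
```

With the -n ... p form, a following command would also run on lines that were never printed, so it would have to be grouped with the p inside braces under the same address.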

Why can't I use the -i option to write to the same file?
– DarkSkull, Commented Sep 27, 2022 at 13:33

@DarkSkull: I recommend a look at the documentation for your sed, because this is where GNU sed and BSD sed differ.
– Cyrus, Commented Sep 27, 2022 at 19:40
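
One way to sidestep the GNU/BSD difference, sketched under the assumption that keeping a backup file is acceptable: attach a suffix directly to -i. The temporary directory and file names below are illustrative only.

```shell
# Work on a throwaway copy in a temporary directory (paths are illustrative).
tmpdir=$(mktemp -d)
printf '%s\n' '<table>' '<tr><td>a</td></tr>' '</table>' > "$tmpdir/page.html"

# GNU sed treats the -i suffix as optional; BSD sed requires an argument.
# A suffix attached directly (-i.bak) is accepted by both and keeps a backup.
sed -i.bak '/^<tr>/!d' "$tmpdir/page.html"

edited=$(cat "$tmpdir/page.html")
backup=$(cat "$tmpdir/page.html.bak")
printf 'edited:\n%s\n\nbackup:\n%s\n' "$edited" "$backup"
```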

user3188445 · Accepted Answer · 2015-08-18 08:52:57Z

sed -e '/^<tr>/d'

The part between / is a regex. The d command deletes matching lines.

Update: oops, sorry, I see you said NOT. So:

sed -e '/^<tr>/!d'

Where ! negates the sense of the match.
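
A side-by-side sketch of the two commands from this answer, run on made-up input, to show what the ! changes:

```shell
# Hypothetical two-line input.
sample=$(printf '%s\n' '<tr>keep</tr>' 'drop me')

# Without !, d deletes the lines that match the address.
without_bang=$(printf '%s\n' "$sample" | sed -e '/^<tr>/d')

# With !, d applies to every line that does NOT match.
with_bang=$(printf '%s\n' "$sample" | sed -e '/^<tr>/!d')

printf 'd:  %s\n!d: %s\n' "$without_bang" "$with_bang"
```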

chaos · Accepted Answer · 2015-08-18 09:33:24Z

If it has to be sed:

sed -ni '/^<tr>/p' file

-i edits the file in place, and -n stops sed from printing every line by default. The regular expression matches every line that starts (^) with <tr>, and p prints those lines. Note that -i without a backup suffix is GNU sed syntax; BSD sed requires an argument after -i.

With grep:

grep -E '^<tr>' file

With -E, grep interprets the pattern as an extended regular expression (this simple pattern works without it, too).

With awk:

awk '/^<tr>/' file

Or pure bash:

while IFS= read -r l; do [[ "$l" =~ ^\<tr\> ]] && echo $l; done <file

[[ is bash's built-in conditional expression. We match $l against the regular expression, and if the match succeeds (&&) we print the line with echo.

Your pure-bash version fails to quote "$l". And you're putting it as the first argument on echo's command line, so you'll have a problem if it starts with a -option. (Use printf '%s\n' "$l".) Also, the shell's read has to read one byte at a time, so it's super slow. Processing text files in pure bash is usually not a good choice unless you know your file is very small.
– Peter Cordes, Commented Aug 18, 2015 at 16:51
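
A sketch of the loop with the fixes from the comment above applied. It needs bash because of [[ ]]. The temporary file, the sample lines, and the variable names re and kept are illustrative assumptions, and the speed caveat still applies.

```shell
# Hypothetical input file; the row contains double spaces that an
# unquoted $l would collapse.
tmpdir=$(mktemp -d)
printf '%s\n' '<p>skip</p>' '<tr>  two  spaces</tr>' > "$tmpdir/in.html"

re='^<tr>'   # regex kept in a variable to avoid quoting surprises in [[ =~ ]]
kept=$(
  # "|| [[ -n $l ]]" also handles a last line that lacks a trailing newline
  while IFS= read -r l || [[ -n $l ]]; do
    if [[ $l =~ $re ]]; then
      # printf prints the line verbatim, whatever characters it starts with
      printf '%s\n' "$l"
    fi
  done < "$tmpdir/in.html"
)
printf '%s\n' "$kept"
```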

VaTo · Accepted Answer · 2015-08-18 16:37:05Z

Easiest and simplest answer would be:

grep '^<tr>' path/to/file

This prints only the lines that start with <tr>, which is useful if you don't want to modify the file directly (as sed -i would).

Then, if you like what you see in the output, you can redirect it to a different file, e.g. > new_file.

This way you don't need to back up your file before experimenting with commands.
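
The preview-then-write workflow could look roughly like this. The temporary directory and the file names page.html and rows.html are made up for the sketch.

```shell
# Throwaway input file (paths are illustrative).
tmpdir=$(mktemp -d)
printf '%s\n' '<table>' '<tr><td>a</td></tr>' '</table>' > "$tmpdir/page.html"

# Step 1: preview on stdout; the original file is not touched.
grep '^<tr>' "$tmpdir/page.html"

# Step 2: once the preview looks right, write to a different file.
# Redirecting onto the input itself would truncate it before grep reads it.
grep '^<tr>' "$tmpdir/page.html" > "$tmpdir/rows.html"

rows=$(cat "$tmpdir/rows.html")
original_lines=$(wc -l < "$tmpdir/page.html")
```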
