I have an HTML file. I want to remove all lines that do not start with <tr>.
I tried:
cat my_file | sed $' s/^[^tr].*// ' | sed '/^$/d'
but it deleted all the lines.
Try this with GNU sed:
sed -n '/^<tr>/p' file
or
sed '/^<tr>/!d' file
!d is particularly useful because it lets you write another sed command within the same expression: lines that are not deleted carry on to the next command. With p, the match is only printed, and the next command still gets its input unchanged.
-i option to write to the same file?
sed -e '/^<tr>/d'
The part between the slashes is a regex. The d command deletes matching lines.
Update: oops, sorry, I saw you said NOT. So:
sed -e '/^<tr>/!d'
where ! negates the sense of the match.
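To see the difference between d and !d side by side, here is a small sketch run on throwaway input. The sample lines and variable names are made up for illustration; only the two sed expressions come from the answer above.

```shell
# Hypothetical sample input: two rows plus surrounding table markup.
sample='<table>
<tr><td>1</td></tr>
</table>
<tr><td>2</td></tr>'

# d deletes the lines that match, so the <tr> lines disappear.
without_rows=$(printf '%s\n' "$sample" | sed -e '/^<tr>/d')

# !d deletes the lines that do NOT match, so only the <tr> lines remain.
only_rows=$(printf '%s\n' "$sample" | sed -e '/^<tr>/!d')

printf '%s\n' "$only_rows"
```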
If it has to be sed:
sed -ni '/^<tr>/p' file
-i edits the file in place, -n prevents sed from printing every line, and the regular expression matches all lines that start (^) with <tr>; those lines are printed (p).
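Because -i overwrites the file, it can be worth keeping a copy while experimenting. A minimal sketch, assuming GNU sed, where a suffix attached to -i (here .bak) makes sed save the original under that name. The scratch directory and the file name page.html are made up for the example. Note that the order in -ni matters: written as -in, GNU sed would take n as the backup suffix instead of as the -n option.

```shell
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
printf '%s\n' '<table>' '<tr><td>1</td></tr>' '</table>' > "$scratch/page.html"

# -n suppresses automatic printing, p prints the matching lines,
# and -i.bak rewrites page.html while keeping the original as page.html.bak.
sed -n -i.bak '/^<tr>/p' "$scratch/page.html"

cat "$scratch/page.html"        # the filtered file
cat "$scratch/page.html.bak"    # the untouched original
```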
With grep:
grep -E '^<tr>' file
With -E, grep interprets the pattern as an extended regular expression.
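For this particular pattern -E is optional, since ^<tr> uses nothing beyond basic regular expressions. A small sketch on made-up input; the last pattern, which also keeps <th> lines, is an illustrative assumption and not part of the question.

```shell
# Hypothetical sample input.
html='<table>
<tr><td>1</td></tr>
<th>header</th>
</table>'

# Same result with and without -E for a pattern this simple.
basic=$(printf '%s\n' "$html" | grep '^<tr>')
extended=$(printf '%s\n' "$html" | grep -E '^<tr>')

# -E starts to matter once extended operators such as ( | ) are used.
tr_or_th=$(printf '%s\n' "$html" | grep -E '^<(tr|th)>')
```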
With awk:
awk '/^<tr>/' file
Or pure bash:
while IFS= read -r l; do [[ "$l" =~ ^\<tr\> ]] && echo $l; done <file
[[ is bash's built-in conditional expression. We compare $l against the regular expression, and if the match succeeds (&&) we print the line with echo.
Quote "$l". And you're putting it as the first argument on echo's command line, so you'll have a problem if it starts with a -option. (Use printf '%s\n' "$l".) Also, shell read has to read one byte at a time, so it's super slow. Processing text files in pure bash is usually not a good choice unless you know your file is very small.
Easiest and simplest answer would be:
grep '^<tr>' path/to/file
This will print only the lines of the file that start with <tr>, which could be good if you don't want to modify the file directly (as sed -i does).
Then, if you like what you see in the output, you can write it to a file with > file
That way you avoid spending time backing up your file before trying out commands.
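That preview-then-save workflow might look like the sketch below. The file names are placeholders. The output deliberately goes to a different file: redirecting with > to the very file grep is reading would truncate it before grep gets to read it.

```shell
# Scratch copy standing in for the real HTML file (names are placeholders).
demo_dir=$(mktemp -d)
printf '%s\n' '<html>' '<tr><td>1</td></tr>' '<tr><td>2</td></tr>' '</html>' > "$demo_dir/page.html"

# 1. Preview: nothing is modified, the result only goes to the terminal.
grep '^<tr>' "$demo_dir/page.html"

# 2. Happy with the output? Write it to a separate file...
grep '^<tr>' "$demo_dir/page.html" > "$demo_dir/rows.html"

# 3. ...and only then, if desired, replace the original.
mv "$demo_dir/rows.html" "$demo_dir/page.html"
```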
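grep. s/^[^tr]... matches lines that start with any character other than t or r, so it also matches the lines starting with <tr>, because their first character is <. Square brackets are a character class in a regex, not a literal string.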
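A quick way to check this, sketched on made-up input: the substitution from the question blanks every line whose first character is anything other than t or r, and the second sed then removes the blanked lines.

```shell
# Hypothetical input: lines starting with <, t, r and x respectively.
input='<tr><td>1</td></tr>
table
row
xyz'

# The substitution from the question, followed by the empty-line cleanup.
survivors=$(printf '%s\n' "$input" | sed 's/^[^tr].*//' | sed '/^$/d')

# Only the lines that begin with t or r are left; the <tr> line is gone.
printf '%s\n' "$survivors"
```

In an HTML file nearly every line begins with < or whitespace, which would explain why the original attempt removed everything.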