I have tried this:
sed -i '' 's/[0-9]*<>/g'
But it didn't work.
Example file:
<Number1> </Number8>
Output:
<Number> </Number>

This is really easy to do with sed, actually. You just get as many as you can in one go, then try, try again:
sed -e :t -e 's/\(<[^<>]*\)[0-9]\{1,\}\([^>]*>\)/\1\2/g;tt'
I tried it with the following random bits of input:
<Number1> 234234 </Nu994845mb6er8>' 234234 <000000000000000000000000000000000000>> <a1> 2 <34b5c> 6 7 def
And the results were:
<Number> 234234 </Number>' 234234 <>> <a> 2 <bc> 6 7 def
The regex matches at least one digit between a < and a >. Because the first bracket expression excludes > as well as <, a match cannot run past the end of a tag. It keeps replacing that digit sequence with nothing until it can no longer do so. That is the purpose of the t (test) command, which branches back to the :t label whenever the substitution succeeded.
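As a rough sketch of the loop described above, the command can be wrapped in a shell function and run on a tag that holds two separate runs of digits. The function name strip_tag_digits_loop is made up for this example, and the first bracket expression is written as [^<>]* (an assumption of this sketch, so that a match cannot extend past a tag's closing >).

```shell
#!/bin/sh
# strip_tag_digits_loop is a hypothetical helper name used only in this sketch.
# It removes digits that sit between a "<" and the next ">" and leaves
# everything outside the tags alone.
strip_tag_digits_loop() {
    # ":t" defines a label. "tt" jumps back to it whenever the s command
    # changed the line, so each pass strips one more run of digits from
    # every tag that still contains one. The loop ends when nothing matches.
    sed -e :t -e 's/\(<[^<>]*\)[0-9]\{1,\}\([^>]*>\)/\1\2/g;tt'
}

# "<a1b2>" needs two passes, "<34b5c>" needs three.
printf '%s\n' '<a1b2> 2 <34b5c> 6 7 def' | strip_tag_digits_loop
# prints: <ab> 2 <bc> 6 7 def
```

Without the loop, a single pass would leave <a1b> and <34bc> behind, since each substitution removes only one run of digits per tag.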
Alternatively, you can do it without a loop:
sed 's/^/>/;s/\(>[^<>]*\)*[0-9]*/\1/g;s/.//' <<\INPUT
<Number1> 234234 </Nu994845mb6er8>' 234234 <000000000000000000000000000000000000>> <a1> 2 <34b5c> 6 7 def
INPUT
Output:
<Number> 234234 </Number>' 234234 <>> <a> 2 <bc> 6 7 def
It will always skip everything from a > up to the next <, so it only affects <[^<>]*> groups. See this if you're interested in why.
The following works:
sed -i 's/\(<[^0-9>]*\)[0-9]*\([^0-9]*>\)/\1\2/g' filename
sed 's/\(<[^0-9]*\)[0-9]*\([^0-9]*>\)/\1\2/g' <<< "<sss>asa1</sss>"
<...>. For <a1b>, it works; for <a1b2>, it doesn't. You need a loop within sed if you want to handle the latter case.
sed - that's just one way. See my edited answer for an example of doing it another way.

You either need a loop around a substitution command (possible in both sed and perl), or a nested substitution command (perl only). I prefer the latter approach; it's a bit more general:
perl -pe 's/\<([^>]*)\>/do{$a = $1; $a =~ s,\d,,g; "\<" . $a . "\>"}/ge;'
Example input:
<a1> 2 <34b5c> 6 7 def
Output:
<a> 2 <bc> 6 7 def
Explanation: The -p option says that we want to read the file line by line, execute the script for each line, and print the result (like in sed); -e means that the next argument is the script to be executed.
Essentially, the script is just a substitution command: We look for <, followed by any number of non->-characters, followed by >. The e modifier after the trailing / indicates a special feature of the substitution command: Its replacement part is not a string to be printed, but again a command sequence to be executed. In this command sequence, we first assign the string between < and > (i.e., $1) to a new variable $a, then execute another substitution command on $a that simply replaces every digit (\d) by nothing, and finally return <, followed by the modified string, followed by >. The g modifier (both after the trailing / and the trailing ,) means that the substitution commands should be executed for every matching string, not just for the first one.
If the opening < and the corresponding > can be on different lines, say,
<abc1
opt="def">
add the option -0777 (i.e., perl -0777 -pe '...'), so that perl reads the entire file before processing it instead of working line by line (slurp mode).
script.pl and then run script.pl inputfile? I am not that experienced with perl.
-i option as in sed, i.e., perl -i -pe 'PERLCOMMANDS' inputfile. Without the -i option, the modified contents are written to standard output.
< and > have to be on the same line.
sed...

short sed way
sed 's/<\([^>]\+\)[0-9]\+>/<\1>/g' file
< > span line boundaries, in fact. I guess, on second thought, you make a very good point.
< and > follow XML formatting? E.g. no nesting << >>, and no unmatched < or >