272

I have a script that produces colored output, and I need to remove the ANSI codes.

#!/bin/bash
exec > >(tee log)   # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript

The output is (in log file):

java (pid 12321) is running...@[60G[@[0;32m OK @[0;39m] 

I didn't know how to put the ESC character here, so I put @ in its place.

I changed the script into:

#!/bin/bash
exec > >(tee log)   # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"

But now it gives me (in log file):

java (pid 12321) is running...@[60G[ OK ] 

How can I also remove this @[60G?

Maybe there is a way to completely disable coloring for the entire script?

2

22 Answers

268

According to Wikipedia, the [m|K] in the sed command you're using is specifically designed to handle m (the color command) and K (the "erase part of line" command). Your script is trying to set absolute cursor position to 60 (^[[60G) to get all the OKs in a line, which your sed line doesn't cover.

(Properly, [m|K] should probably be (m|K) or [mK], because you're not trying to match a pipe character. But that's not important right now.)

If you switch that final match in your command to [mGK] or (m|G|K), you should be able to catch that extra control sequence.

./somescript | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2};?)?)?[mGK]//g" 
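
A quick way to check the change is to feed the log line from the question through the pattern. The sketch below is only a check under assumptions: strip_basic is a made-up helper name, ESC is generated with printf instead of \x1B, and -E is used instead of -r so the same line should also run on BSD sed. The regex itself is the one shown above.

```shell
#!/bin/sh
# ESC byte produced portably; the answer writes it as \x1B, which needs GNU sed.
esc=$(printf '\033')

# strip_basic: hypothetical helper wrapping the answer's regex ([mGK] terminator).
strip_basic() {
  sed -E "s/${esc}\[([0-9]{1,3}(;[0-9]{1,2};?)?)?[mGK]//g"
}

# The log line from the question, with real ESC bytes in place of "@".
printf 'java (pid 12321) is running...\033[60G[\033[0;32m OK \033[0;39m]\n' | strip_basic
# prints: java (pid 12321) is running...[ OK ]
```

Note that the cursor-positioning code is removed, not emulated, so the column alignment of the OK markers is lost in the log.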

12 Comments

BSD/OSX users: We usually don't have the -r option to sed. brew install gnu-sed will install a capable version. Run with gsed.
@edi9999 As far as I can tell, the difference there is that color settings beyond 16 colors (as setaf supports) require more parameters than just two; my regex supports two. Changing the first ? out for * should help. Handling sgr0 is possible but based on a search it likely grows outside the scope of this hacky regex-based answer.
This doesn't work reliably as there can be a third value (ala [38;5;45m). This alternative answer works unix.stackexchange.com/a/55547/168277
Adding this as alias decolorize='sed -r "s/\\x1B\\[([0-9]{1,3}(;[0-9]{1,2})?)?[mGK]//g"' to your bashrc creates a very nice utility that can be used both as command | decolorize and as decolorize file.log.
More correct form is sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,3})*)?[mGK]//g", because there can be more terms and they can be also 3 digits long. E.g. \e[38;5;123m or even \e[38;5;123;48;5;246m.
116

IMHO, most of these answers try too hard to restrict what is inside the escape code. As a result, they end up missing common codes like [38;5;60m (foreground ANSI color 60 from 256-color mode).

They also require the -r option which enables GNU extensions. These are not required; they just make the regex read better.

Here is a simpler answer that handles the 256-color escapes and works on systems with non-GNU sed:

./somescript | sed 's/\x1B\[[0-9;]\{1,\}[A-Za-z]//g' 

This will catch anything that starts with [, has any number of decimals and semicolons, and ends with a letter. This should catch any of the common ANSI escape sequences.

For funsies, here's a larger and more general (but minimally tested) solution for all conceivable ANSI escape sequences:

./somescript | sed 's/\x1B[@A-Z\\\]^_]\|\x1B\[[0-9:;<=>?]*[-!"#$%&'"'"'()*+,.\/]*[][\\@A-Z^_`a-z{|}~]//g' 

(and if you have @edi9999's SI problem, add | sed "s/\x0f//g" to the end; this works for any control char by replacing 0f with the hex of the undesired char)
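
As a rough illustration of the simpler pattern, here is a sketch that wraps it in a function and runs it on a 256-color sequence. The name strip_csi is made up for this example. It uses * rather than \{1,\} so that a bare ESC[m is also removed (an assumption about what you want), generates ESC with printf so it does not rely on \x1B support, and uses tr to drop the SI character (octal 017, i.e. 0x0f).

```shell
#!/bin/sh
# ESC byte produced portably, so no \x1B support is needed in sed.
esc=$(printf '\033')

# strip_csi: hypothetical name; removes ESC [ <digits and semicolons> <letter>,
# then deletes any SI (shift in, 0x0f) bytes.
strip_csi() {
  sed "s/${esc}\[[0-9;]*[A-Za-z]//g" | tr -d '\017'
}

printf '\033[38;5;60mforeground 60\033[m and \033[1;31mbold red\033[0m\017\n' | strip_csi
# prints: foreground 60 and bold red
```

This only covers sequences that start with ESC [ and contain digits and semicolons; sequences with other parameter characters, such as ESC[?25l, are left in place.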

9 Comments

This one worked nicely to strip colour out of Azure az cli prettified output.
Fixed @elig. Turns out it had a number of issues, starting with some editor replacing all my dashes with weird unicode versions, but also a bunch of improper escaping - | in sed, ] inside a character class in sed, and ' in a single-quoted bash string. It is now working for me for a very basic test case.
I think there might be an error with the first regex - \+ will make the plus sign a literal, but I think it is meant to be an "at least one" modifier of the previous range.
@halfer, when using sed without the -r option, + is treated as a literal and \+ is treated as a modifier, in contradiction to most modern usage.
This is my preferred answer, but for my use case there was a small problem, the output I was processing contained ^[[m which was not being caught. Resolved by amending like so ./somescript | sed 's/\x1B\[[0-9;]*[A-Za-z]//g'
103

I came across ansi2txt tool from colorized-logs package in Debian. The tool drops ANSI control codes from STDIN.

Usage example:

./somescript | ansi2txt 

Source code http://github.com/kilobyte/colorized-logs

5 Comments

The package 'colorized-logs' exists in Ubuntu's standard repository, which makes the installation of this tool very easy with: 'sudo apt install colorized-logs'. It works great and I have yet to find a problem with it. This should be the official answer.
Likewise for AUR folks: aur/colorized-logs
From the manual page: All ANSI codes are simply ignored, including all cursor positioning ones. This package removes all colorized parts, e.g. timestamp which makes it pretty useless.
It seems that one can also install with pip install --user ansi2txt or with pip install ansi2txt.
Arch Linux's version is pulled from here: github.com/gabe565/ansi2txt (aur.archlinux.org/packages/ansi2txt-bin)
52

I couldn't get decent results from any of the other answers, but the following worked for me:

somescript | sed -r "s/[[:cntrl:]]\[[0-9]{1,3}m//g" 

If I only removed the control char "^[", it left the rest of the color data, e.g., "33m". Including the color code and "m" did the trick. I'm puzzled why s/\x1B//g doesn't work, because \x1B[31m certainly works with echo.

6 Comments

On OSX (BSD sed), use -E instead of -r for extended regex. More could be found here
I had to replace {1,3} with {,3} (otherwise it was still skipping some controls), thanks for your solution!
There might be multiple numbers separated by semicolons (for background color, bold, italic, etc.). This command worked for me: sed -r "s/[[:cntrl:]]\[([0-9]{1,3};)*[0-9]{1,3}m//g"
This one (of the many I tested) worked with Ansible output that had been run with unbuffer.
For those who want to view logs that contain color codes using the less command, this worked for me on Ubuntu: cat errors.log | sed -r "s/[[:cntrl:]]\[[0-9]{1,3}m//g" | tee errors-copy.log | less errors-copy.log
50

For Mac OSX or BSD use

./somescript | sed $'s,\x1b\\[[0-9;]*[a-zA-Z],,g' 

5 Comments

Strange, this one worked fine for debian but other above didn't.
This one partially worked. However, if I open a file in excel, I'm still seeing this special character "?" at the end of each line.
@doudy_05 Try to pass -E flag for sed to enable extended regexp.
Works for Raspbian (partly, I can see ^M) and MacOs Big Sur (completely). Thanks, saved my time.
This one works for Busybox (Asus router) while others don't.
35

The regular expression below will miss some ANSI Escape Codes sequences, as well as 3 digit colors. Example and Fix on regex101.com.

Use this instead:

./somescript | sed -r 's/\x1B\[(;?[0-9]{1,3})+[mGK]//g' 

I also had the problem that sometimes, the SI character appeared.

It happened for example with this input : echo "$(tput setaf 1)foo$(tput sgr0) bar"

Here's a way to also strip the SI character (shift in) (0x0f)

./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | sed "s/\x0f//g" 

2 Comments

Not sure why this answer receives so little credit. This is the only one working for me...
This one is close to working, but it misses the three digit case and sequences of color codes like: U+001B[38;2;128;128;128m. See the unfound colors at regex101.com/r/Qjtopi/1. Regex that works for me can be found at regex101.com/r/wYygBw/1
14

There's also a dedicated tool to handle ANSI escape sequences: ansifilter. Use the default --text output format to strip all ANSI escape sequences (note: not just coloring).

ref: https://stackoverflow.com/a/6534712

2 Comments

in case you like to build it yourself, the source is at gitlab.com/saalen/ansifilter
brew install ansifilter
12

Not sure what's in ./somescript but if escape sequences are not hardcoded you can set the terminal type to avoid them

TERM=dumb ./somescript 

For example, if you try

TERM=dumb tput sgr0 | xxd 

you'll see it produces no output while

tput sgr0 | xxd
00000000: 1b28 421b 5b6d                           .(B.[m

does (for xterm-256color).
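
For a script you control, "not hardcoded" could look something like the sketch below. It is an illustration under assumptions, not what ./somescript actually does: say_ok is a made-up function, and the check (stdout is a terminal and TERM is not dumb) is just one reasonable policy.

```shell
#!/bin/sh
# say_ok: hypothetical status printer that emits color codes only when
# stdout is a terminal and TERM is set to something other than "dumb".
say_ok() {
  if [ -t 1 ] && [ "${TERM:-dumb}" != "dumb" ]; then
    printf '[\033[0;32m OK \033[0;39m]\n'
  else
    printf '[ OK ]\n'
  fi
}

say_ok
```

With a check like this, both TERM=dumb ./somescript and piping the output into another program produce plain text, so nothing needs to be stripped afterwards.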

6 Comments

Simplest solution/answer by far!
Assuming your tools pay attention to the terminal type. In general, well behaved tools already have a switch to disable ANSI sequences (usually badly named like --no-color)
@sehe that's why I mentioned "but if escape sequences are not hardcoded..."
I upvoted, but in the general case this is not easy to tell. I suppose it will be a minority of tools that use tput, and a somewhat wider set of tools that try to be smart (or even sometimes correct) about terminal capabilities.
TERM=dumb folks
12

Much simpler function in pure Bash to filter-out common ANSI codes from a text stream:

# Strips common ANSI codes from a text stream

shopt -s extglob # Enable Bash Extended Globbing expressions
ansi_filter() {
  local line
  local IFS=
  while read -r line || [[ "$line" ]]; do
    printf '%s\n' "${line//$'\e'[\[(]*([0-9;])[@-n]/}"
  done
}

See:

  1. linuxjournal.com: Extended Globbing
  2. gnu.org: Bash Parameter Expansion

6 Comments

This doesn’t work. Test with tldr. (Though I use zsh so it might also be because of that.)
Indeed, Zsh will not understand Bash’s extended globbing (extglob), and it probably won't understand this string replacement either.
I did enable the extendedglob of zsh ... String replacement should be posix, too?
String replacement is not POSIX. You can use any of the alternate methods using sed mentioned here that will work with Zsh.
This solution has the advantage of line-buffering the text. I tried with sed but it was block-buffering my pipe.
9

I had a similar problem. All solutions I found did work well for the color codes but did not remove the characters added by "$(tput sgr0)" (resetting attributes).

Taking, for example, the solution in the comment by davemyron the length of the resulting string in the example below is 9, not 6:

#!/usr/bin/env bash

string="$(tput setaf 9)foobar$(tput sgr0)"
string_sed="$( sed -r "s/\x1B\[[0-9;]*[JKmsu]//g" <<< "${string}" )"
echo ${#string_sed}

In order to work properly, the regex had to be extended to also match the sequence added by sgr0 ("\E(B"):

string_sed="$( sed -r "s/\x1B(\[[0-9;]*[JKmsu]|\(B)//g" <<< "${string}" )" 
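
To see the difference without depending on the local terminfo database, the sketch below writes the bytes out literally. It assumes an xterm-256color terminal, where tput setaf 9 should produce ESC[91m and tput sgr0 produces ESC(B ESC[m; other terminal types may emit different bytes. ESC is generated with printf and -E is used instead of -r for portability.

```shell
#!/bin/sh
esc=$(printf '\033')

# Assumed xterm-256color output of: "$(tput setaf 9)foobar$(tput sgr0)"
string=$(printf '\033[91mfoobar\033(B\033[m')

# Color-only pattern: the three bytes ESC ( B from sgr0 are left behind.
colors_only=$(printf '%s\n' "$string" | sed -E "s/${esc}\[[0-9;]*[JKmsu]//g")

# Extended pattern: also matches the ESC ( B charset sequence.
with_sgr0=$(printf '%s\n' "$string" | sed -E "s/${esc}(\[[0-9;]*[JKmsu]|\(B)//g")

echo "${#colors_only}"   # prints 9
echo "${#with_sgr0}"     # prints 6
```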

1 Comment

@Jarodiv - thanks for the most comprehensive approach. All the answers provided on this topic deal ONLY with ANSI/VT100 control sequences (ex: "\e[31mHello World\e[0m"), but do not remediate anything caused by tput text formatting (ex: tput smso/tput setaf X/tput rmso/tput sgr0). As a result, after all the sed executions there was still some other mess remaining in the logs. This is a pure solution to my use cases!
6

I spent a little while searching around for a solution that didn't involve memorizing sed spells, and have not seen this suggested anywhere:

./somescript | tr -dc '[[:print:][:space:]]' 

removes any character that is not either printable or whitespace (including vertical).
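
One caveat worth checking before relying on this: ESC itself is a control character, but the rest of a color sequence (such as [31m) is ordinary printable text. A small experiment, with keep_printable as a made-up wrapper name:

```shell
#!/bin/sh
# keep_printable: hypothetical wrapper around the tr command above.
keep_printable() {
  tr -dc '[[:print:][:space:]]'
}

# Only the ESC bytes are deleted; the bracket, digits and "m" survive.
printf '\033[31mred\033[0m\n' | keep_printable
# prints: [31mred[0m
```

So this is probably best used as a final pass for stray control bytes, after one of the sed patterns has removed the complete sequences.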

5

Here's a pure Bash solution.

Save as strip-escape-codes.sh, make executable and then run <command-producing-colorful-output> | ./strip-escape-codes.sh.

Note that this strips all ANSI escape codes/sequences. If you want to strip colors only, replace [a-zA-Z] with "m".

Bash >= 4.3 (local -n namerefs were added in Bash 4.3):

#!/usr/bin/env bash

# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
    local _input="$1" _i _char _escape=0
    local -n _output="$2"; _output=""
    for (( _i=0; _i < ${#_input}; _i++ )); do
        _char="${_input:_i:1}"
        if (( ${_escape} == 1 )); then
            if [[ "${_char}" == [a-zA-Z] ]]; then
                _escape=0
            fi
            continue
        fi
        if [[ "${_char}" == $'\e' ]]; then
            _escape=1
            continue
        fi
        _output+="${_char}"
    done
}

while read -r line; do
    strip_escape_codes "${line}" line_stripped
    echo "${line_stripped}"
done

Bash < 4.3:

#!/usr/bin/env bash

# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
    local input="${1//\"/\\\"}" output="" i char escape=0
    for (( i=0; i < ${#input}; ++i )); do         # process all characters of input string
        char="${input:i:1}"                       # get current character from input string
        if (( ${escape} == 1 )); then             # if we're currently within an escape sequence, check if
            if [[ "${char}" == [a-zA-Z] ]]; then  # end is reached, i.e. if current character is a letter
                escape=0                          # end reached, we're no longer within an escape sequence
            fi
            continue                              # skip current character, i.e. do not add to output
        fi
        if [[ "${char}" == $'\e' ]]; then         # if current character is '\e', we've reached the start
            escape=1                              # of an escape sequence -> set flag
            continue                              # skip current character, i.e. do not add to output
        fi
        output+="${char}"                         # add current character to output
    done
    eval "$2=\"${output}\""                       # assign output to target variable
}

while read -r line; do
    strip_escape_codes "${line}" line_stripped
    echo "${line_stripped}"
done

1 Comment

Well, this solution could be even less complicated.
3

The controversial idea would be to reconfigure terminal settings for this process environment to let the process know that terminal does not support colors.

Something like TERM=xterm-mono ./somescript comes to my mind. YMMV with your specific OS and ability of your script to understand terminal color settings.

1 Comment

I used your solution and it worked. However, I believe you may need to set it to something other than xterm*; at least it didn't work for me until I set TERM to something that didn't start with xterm. In my case: TERM= ./my_script worked like a charm.
3

The other answers didn't quite manage to strip all escape codes (e.g. \x1b[?25l).

This little sed command should do the trick:

./somescript | sed -r 's/[\x1B\x9B][][()#;?]*(([a-zA-Z0-9;]*\x07)|([0-9;]*[0-9A-PRZcf-ntqry=><~]))//g' 

The regex is a modification of https://github.com/acarl005/stripansi/blob/master/stripansi.go#L7

3

@jeff-bowman's solution helped me get rid of SOME of the color codes. I added another small portion to the regex in order to remove some more:

sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # Original. Removed Red ([31;40m[1m[error][0m)
sed -r "s/\x1B\[([0-9];)?([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # With an addition, removed yellow and green ([1;33;40m[1m[warning][0m and [1;32;40m[1m[ok][0m)
                ^^^^^^^^^ remove Yellow and Green (and maybe more colors)

2

Hmm, not sure if this will work for you, but 'tr' will 'strip' (delete) control codes - try:

./somescript | tr -d '[:cntrl:]' 

6 Comments

Unfortunately, it also removes newlines.
Yes, LF and CR are control codes; if you are interested in more than one line then this may not be a solution. Since it appears that you are running a Java program, I will guess that the colors are managed from there; otherwise you would need to look at your console setup (i.e. terminal settings/color scheme) and/or at the options for each command that supports 'colors', i.e. ls --color=never
I like this answer for its elegance, even if it does more than just removing colours. Thanks!
It actually leaves codes there; see ls -l + your command: rwxr-xr-x 1 tokra admin 22 Oct 18 14:21 [0m[01;36m/usr/local/opt/gradle[0m -> [01;34m../Cellar/gradle/4.2.1[0m/
Control codes are not ANSI codes. This does not answer the question at all.
2

I had some issues with colorized output which the other solutions here didn't process correctly, so I built this perl one-liner. It looks for escape \e followed by an opening bracket \[ followed by one or more color codes \d+ separated by semicolons, ending in m.

perl -ple 's/\e\[\d+(;\d+)*m//g' 

It seems to work really well for colorized compiler output.

0

I came across this question/answers trying to do something similar as the OP. I found some other useful resources and came up with a log script based on those. Posting here in case it can help others.

Digging into the links helps understand some of the redirection which I won't try and explain because I'm just starting to understand it myself.

Usage will render the colorized output to the console, while stripping the color codes out of the text going to the log file. It will also include stderr in the logfile for any commands that don't work.

Edit: adding more usage at bottom to show how to log in different ways

#!/bin/bash
set -e

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

. $DIR/dev.conf
. $DIR/colors.cfg

filename=$(basename ${BASH_SOURCE[0]})
# remove extension
# filename=`echo $filename | grep -oP '.*?(?=\.)'`
filename=`echo $filename | awk -F\. '{print $1}'`
log=$DIR/logs/$filename-$target

if [ -f $log ]; then
  cp $log "$log.bak"
fi

exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>$log 2>&1

# log message
log(){
  local m="$@"
  echo -e "*** ${m} ***" >&3
  echo "=================================================================================" >&3
  local r="$@"
  echo "================================================================================="
  echo -e "*** $r ***" | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g"
  echo "================================================================================="
}

echo "=================================================================================" >&3
log "${Cyan}The ${Yellow}${COMPOSE_PROJECT_NAME} ${filename} ${Cyan}script has been executed${NC}"
log $(ls) #log $(<command>)
log "${Green}Apply tag to image $source with version $version${NC}"
# log $(exec docker tag $source $target 3>&2) #prints error only to console
# log $(docker tag $source $target 2>&1) #prints error to both but doesn't exit on fail
log $(docker tag $source $target 2>&1) && exit $? #prints error to both AND exits on fail
# docker tag $source $target 2>&1 | tee $log # prints gibberish to log
echo $? # prints 0 because log function was successful
log "${Purple}Push $target to acr${NC}"

Here are the other links that helped:

0

I used perl as I have to do this frequently on many files. This goes through all files matching filename*.txt and removes any formatting. It works for my use case and may be useful for someone else too, so I thought I'd post it here. Put your own file name in place of filename*.txt, or set the FILENAME variable below to a space-separated list of file names.

$ FILENAME=$(ls filename*.txt) ; for file in $(echo $FILENAME); do echo $file; cat $file | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > $file-new; mv $file-new $file; done 

0

my contribution:

./somescript | sed -r "s/\\x1B[\\x5d\[]([0-9]{1,3}(;[0-9]{1,3})?(;[0-9]{1,3})?)?[mGK]?//g" 

-3

for macOS

$ my_program | pbcopy && pbpaste 

1 Comment

What makes you think that pbcopy removes escape sequences for colour codes from the input?
-9

This works for me:

./somescript | cat 

2 Comments

That depends on how somescript is implemented. It may or may not recognise that its standard output is a tty. (The worst offenders actually hard-code terminal-specific escape codes into the program, and break horribly when used on other terminals or in scripts).
Thanks Toby. I used django's manage.py to test, but what you said makes sense.
