49

Can I use echo to generate a UTF-8 text file? For example, if I want to generate a file that contains the character ę:

echo "abcd ę" > out.txt 

(the batch file is encoded with UTF-8)

The result is an ANSI-encoded file, and the ę character is transformed into ê. How can I convince echo to generate a UTF-8 file?

If that's not possible, can I change the encoding of the text file after creating it? Is there a tool in the GnuWin32 package that can help me change the encoding?

thanks

0

6 Answers

85

Use the chcp command to change the active code page to 65001 (UTF-8) before the echo line runs:

chcp 65001 

12

Try starting cmd.exe with the /U switch: it causes the output of internal commands to a pipe or file to be Unicode instead of ANSI. Note that "Unicode" here means UTF-16LE, not UTF-8.
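To illustrate why Unicode output from /U is not the same thing as a UTF-8 file, here is a minimal Python sketch (Python is used purely for illustration and is not part of the original answer). It compares the bytes of the question's sample text in UTF-16LE, the encoding cmd uses for its Unicode output, and in UTF-8:

```python
# Illustration only: the same text in the two encodings.
text = "abcd ę"

utf16le_bytes = text.encode("utf-16-le")  # two bytes for each character in this text
utf8_bytes = text.encode("utf-8")         # one byte per ASCII character, two for 'ę'

# Print both byte sequences as hex so the difference is visible.
print(utf16le_bytes.hex(" "))
print(utf8_bytes.hex(" "))
```

A tool that expects UTF-8 will not read the first byte sequence correctly, so /U alone does not answer the question as asked.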


6

chcp 65001

as mentioned by @cuixiping, is a good answer, but it requires changing the cmd default font (to Lucida Console, for example), as you can read here: https://superuser.com/questions/237081/whats-the-code-page-of-utf-8#272184

And of course, as mentioned by @BearCode, the text should be in UTF-8. In my case I used Vim under GNU/Linux over remote access, but Notepad++ works well too!


3

The problem was that the file contained the line:

<META content="text/html; charset=iso-8859-2" http-equiv=Content-Type> 

and so Notepad2 and Firefox were switching to that charset, showing Ä instead of ę. In plain Notepad, the file looks OK. The solution was to add the UTF-8 signature (byte order mark) at the beginning of the file:

echo1 -ne \xEF\xBB\xBF > out.htm 

(echo1 is from gnuwin32)
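The symptom and the fix can be reproduced at the byte level. The following is a minimal Python sketch for illustration only (it is not what the batch file runs). It assumes the viewer decoded the UTF-8 bytes as ISO-8859-2, as the META tag requested:

```python
import codecs

text = "abcd ę"
utf8_bytes = text.encode("utf-8")  # 'ę' (U+0119) becomes the two bytes C4 99

# Decoding those bytes as ISO-8859-2 turns C4 into 'Ä'
# and 99 into an invisible control character.
misread = utf8_bytes.decode("iso-8859-2")

# The UTF-8 signature is the three bytes EF BB BF placed before the text.
with_bom = codecs.BOM_UTF8 + utf8_bytes

print(utf8_bytes.hex(" "))
print(repr(misread))
print(with_bom[:3].hex(" "))
# 'utf-8-sig' strips the signature again when decoding.
print(with_bom.decode("utf-8-sig") == text)
```

The signature gives editors and browsers a strong hint that the content is UTF-8, which is why the display problem went away even though the META tag still named a different charset.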

thanks for the answers

1 Comment

Technically, that's an invalid file then. If you add the byte order mark (which is a good way to do this), you should change charset to "charset=utf-8".
0

It appears that, as well as changing the code page, you need to write at least one Unicode character in your first echo to the file for the file to be saved as Unicode. So your batch file itself needs to be stored in a Unicode format like UTF-8.

1 Comment

An ASCII file is a valid UTF-8 file. Notepad++ and other editors may not tell you it's UTF-8, but it is. They look for non-ASCII characters to guess the actual encoding if there is no UTF BOM header in your file, but don't be fooled by what they say: ASCII is fully UTF-8 compatible. (But don't confuse ASCII with US extended ASCII.)
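This compatibility claim is easy to check. Here is a minimal Python sketch, for illustration only, showing that pure ASCII text has identical bytes in both encodings, while a byte from an "extended ASCII" code page (Windows-1252 is assumed here as the example) is not valid UTF-8 on its own:

```python
ascii_text = "abcd"

# ASCII text encodes to exactly the same bytes in ASCII and in UTF-8.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# 'ê' in Windows-1252 is the single byte EA, which UTF-8 treats as the
# start of a multi-byte sequence, so decoding it alone fails.
extended = "abcd ê".encode("cp1252")
try:
    extended.decode("utf-8")
    extended_is_valid_utf8 = True
except UnicodeDecodeError:
    extended_is_valid_utf8 = False

print(same_bytes)
print(extended_is_valid_utf8)
```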
0
I'm not sure if this is the answer you are looking for or if it's already been answered for you... I'd use the caret character ( ^ ) as the escape character in a batch file when echoing to a file. See the examples.

Desired output:

<META content="text/html; charset=iso-8859-2" http-equiv=Content-Type>

Replace the code with this:

Example 1:

echo ^<META content="text/html; charset=iso-8859-2" http-equiv=Content-Type^>

Example 2:

echo ^<?xml version="1.0" encoding="utf-8" ?^>

1 Comment

If you are not sure whether your answer is relevant to the question, you should read the question first. And no, it's absolutely not an answer to the question.
