$\begingroup$

The default setting "UTF8"/"UTF-8" results in a text file without a byte order mark (BOM)/signature. What if I need a text file in UTF-8 with BOM, and don't want to use a third-party tool to modify the encoding?

$\endgroup$

1 Answer

$\begingroup$

I tried for a while but could not find a built-in option for this. (That seems a bit strange to me, because the Import function clearly knows how to handle text files in UTF-8 with a BOM.) So I decided to write my own.

As mentioned in e.g. this post:

A BOM-ed UTF-8 string will start with the three following bytes. EF BB BF
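
As a quick sanity check, these three bytes are simply the UTF-8 encoding of the code point U+FEFF. A minimal sketch of how this could be confirmed in Mathematica (the variable name bom is only for illustration):

```mathematica
(* U+FEFF is the byte order mark code point; encode it as UTF-8 bytes *)
bom = ToCharacterCode[FromCharacterCode[16^^FEFF], "UTF8"]
(* {239, 187, 191} *)

(* the same bytes written in hexadecimal *)
IntegerString[#, 16] & /@ bom
(* {"ef", "bb", "bf"} *)
```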

So we just need to add these three bytes and export:

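exportUTF8BOM[file_, text_] := With[{bw = BinaryWrite[file, #] &},
  bw@{16^^ef, 16^^bb, 16^^bf}; (* write the BOM bytes EF BB BF first *)
  bw@ToCharacterCode[text, "UTF8"]; (* then the text encoded as UTF-8 bytes *)
  Close@file] (* close the stream; returns the file name *)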

Test:

exportUTF8BOM["tst.txt", "太阳当空照,花儿对我笑"] // SystemOpen 
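
Besides opening the file in an editor, the result can be checked programmatically. The following is a minimal sketch, assuming the test above has already written tst.txt to the current working directory; it reads the raw bytes back and compares them with what was intended:

```mathematica
(* the first three bytes of the file should be the BOM *)
BinaryReadList["tst.txt", "Byte", 3] === {16^^ef, 16^^bb, 16^^bf}

(* everything after the BOM should decode back to the original text *)
FromCharacterCode[Drop[BinaryReadList["tst.txt"], 3], "UTF8"] ===
 "太阳当空照,花儿对我笑"
```

Both comparisons should evaluate to True if the file was written as intended.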


$\endgroup$
