Encoding problems in Python - 'ascii' codec can't encode character '\xe3' when using UTF-8

Question

I've written a program that prints out some HTML content. My source file is in UTF-8, the server's terminal is in UTF-8, and I also use:

out = out.encode('utf8')

to make sure the string is in UTF-8. Despite all that, when the string out contains characters like "ã" or "é", I get:

UnicodeEncodeError: 'ascii' codec can't encode character '\xe3' in position 84: ordinal not in range(128)

It seems to me that the print after:

print("Content-Type: text/html; charset=utf-8 \n\n")

is being forced to use ASCII encoding, but I just don't know why this would be the case.

Don't randomly call functions "to make sure". Please show the full context of that line, and the full traceback.
– Daniel Roseman, Commented Jun 16, 2015 at 11:16
Can you show a small reproducible example, i.e. your program without the lines that are not needed to produce the error?
– User, Commented Jun 16, 2015 at 11:20

Brian Tompsett - 汤莱恩 · Accepted Answer · 2015-06-25 20:56:49Z

Thanks a lot.

Here is how I solved the encoding problem with Python 3.4.1. First I inserted this line in the code to check the output encoding:

import sys
print(sys.stdout.encoding)

And I saw that the output encoding was:

ANSI_X3.4-1968

which is another name for ASCII and doesn't support characters like 'ã', 'é', etc.
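To see the failure in isolation, here is a minimal sketch. The sample string is made up for illustration; only the encoding name comes from the output above. It checks that Python resolves ANSI_X3.4-1968 to its ASCII codec, then encodes the same text with both codecs:

```python
import codecs

# Hypothetical sample text containing the characters from the question.
text = "São Paulo é"

# The name reported by sys.stdout.encoding resolves to the ASCII codec.
codec_name = codecs.lookup("ANSI_X3.4-1968").name

# UTF-8 can represent every code point, so this succeeds.
utf8_bytes = text.encode("utf-8")

# ASCII only covers code points 0-127, so 'ã' (U+00E3) cannot be encoded.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(codec_name)
print(utf8_bytes)
print(ascii_error)
```

This is the same error that print raises when stdout is bound to the ASCII codec: the str is encoded at write time, using whatever encoding the stream was opened with.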

So I deleted the previous line and inserted these lines to change the standard output encoding:

import codecs
import sys
# Re-wrap the underlying byte streams so that text is encoded as UTF-8.
if sys.stdout.encoding != 'UTF-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'UTF-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')

Here is where I found the information:

http://www.macfreek.nl/memory/Encoding_of_Python_stdout
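The effect of that wrapper can be checked without touching the real stdout. The sketch below is only an illustration: it assumes an in-memory io.BytesIO as a stand-in for sys.stdout.buffer, and the HTML line is invented.

```python
import codecs
import io

# Stand-in for sys.stdout.buffer so the written bytes can be inspected.
fake_buffer = io.BytesIO()

# Same wrapping as in the answer: a writer that encodes str to UTF-8 bytes.
writer = codecs.getwriter("utf-8")(fake_buffer, "strict")
writer.write("Content-Type: text/html; charset=utf-8\n\n")
writer.write("<p>ã é</p>\n")

raw = fake_buffer.getvalue()
print(raw)
```

The non-ASCII characters arrive in the buffer as two-byte UTF-8 sequences instead of raising UnicodeEncodeError. On Python 3.7 and later, sys.stdout.reconfigure(encoding='utf-8') is another way to get a similar result; it was not available in the 3.4.1 used here.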

P.S.: Everybody says it's not good practice to change the default encoding. I really don't know about that. In my case it has worked fine, but I'm building a very small and simple web app.

hspandher · Accepted Answer · 2015-06-16 11:51:37Z


I guess you should read the file as a unicode object; that way you might not need to encode it.

import codecs
file = codecs.open('file.html', 'w', 'utf-8')
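As a rough sketch of what codecs.open does in both directions, the round trip below writes and then reads a file. The temporary directory and the HTML content are assumptions made for the example, not part of the original program.

```python
import codecs
import os
import tempfile

# Hypothetical file in a throwaway directory, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "file.html")

# Writing: str goes in, UTF-8 encoded bytes land on disk.
with codecs.open(path, "w", "utf-8") as f:
    f.write("<p>ã é</p>")

# Reading: the UTF-8 bytes are decoded back into a str.
with codecs.open(path, "r", "utf-8") as f:
    content = f.read()

print(type(content).__name__, len(content))
```

In Python 3 the built-in open(path, encoding='utf-8') behaves the same way for this purpose. Neither call changes the encoding that print uses for stdout.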


2 Comments

Alexandre Cavalcante Over a year ago

Yes, I read all the files using the procedure you described. The problem was the standard output: it was set to ANSI_X3.4-1968. So even though I read all the files as UTF-8, when I used print to generate the HTML, it was being converted to ASCII. Thanks a lot anyway, @hspandher

hspandher Over a year ago

All is well, if end is well
