0

I've written a program that prints some HTML content. My source file is in UTF-8, the server's terminal is in UTF-8, and I also use:

out = out.encode('utf8') 

to make sure the string is in UTF-8. Despite all that, when the string out contains characters like "ã" or "é", I get:

UnicodeEncodeError: 'ascii' codec can't encode character '\xe3' in position 84: ordinal not in range(128) 

It seems to me that every print after:

print("Content-Type: text/html; charset=utf-8 \n\n") 

is being forced to use ASCII encoding, but I don't know why that would be the case.
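The same exception can be reproduced without a web server. The following is only a minimal sketch, assuming the cause is a text stream whose encoding is ASCII; `write_with_encoding` is a hypothetical helper name, not something from the original program.

```python
import io


def write_with_encoding(text, encoding):
    """Write text through a text stream using the given encoding.

    Returns the bytes that reached the underlying binary buffer.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors="strict")
    stream.write(text)   # raises UnicodeEncodeError if the codec cannot encode text
    stream.flush()       # push the encoded bytes down into raw
    return raw.getvalue()


# UTF-8 can represent the character, ASCII cannot.
utf8_bytes = write_with_encoding("ã", "utf-8")

try:
    write_with_encoding("ã", "ascii")
except UnicodeEncodeError as exc:
    ascii_error = str(exc)
```

If `sys.stdout` behaves like the ASCII stream here, the encoding of the source file and of the terminal would not matter, because the failure happens when `print` encodes the string for output.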

2
  • Don't randomly call functions "to make sure". Please show the full context of that line, and the full traceback. Commented Jun 16, 2015 at 11:16
  • Can you show a small reproducible example, i.e. your program without the lines that do not produce the error? Commented Jun 16, 2015 at 11:20

2 Answers

4

Thanks a lot.

Here is how I solved the encoding problem with Python 3.4.1. First, I inserted this line in the code to check the output encoding:

print(sys.stdout.encoding) 

And I saw that the output encoding was:

ANSI_X3.4-1968

which stands for ASCII and doesn't support characters like 'ã', 'é', etc.

So I deleted that check and inserted the following lines to change the encoding of standard output and standard error:

import codecs
import sys
# Wrap the underlying binary streams so text is encoded as UTF-8.
if sys.stdout.encoding != 'UTF-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'UTF-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')
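For comparison, here is a sketch of a different approach that leaves `sys.stdout` untouched: encode the text yourself and write the bytes to the binary layer, `sys.stdout.buffer`. This is not what the answer above does. `emit_page` is a hypothetical helper name, and the header simply mirrors the one in the question.

```python
import io
import sys


def emit_page(body, stream=None):
    """Write a CGI header plus an HTML body as UTF-8 bytes (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout.buffer  # binary layer underneath sys.stdout
    header = "Content-Type: text/html; charset=utf-8\n\n"
    stream.write((header + body).encode("utf-8"))
    stream.flush()


# Try it against an in-memory buffer instead of the real stdout.
demo = io.BytesIO()
emit_page("<p>ã é</p>", demo)
```

In Python 3, `out.encode('utf8')` returns a `bytes` object, so it has to go to a binary stream like this one; passing it to `print` would not help. If `print` is used alongside this, flushing `sys.stdout` first keeps the output in order.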

Here is where I found the information:

http://www.macfreek.nl/memory/Encoding_of_Python_stdout

P.S.: Everybody says it's not good practice to change the default encoding. I really don't know about that. It has worked fine in my case, but I'm building a very small and simple web app.
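As a quick check of the claim that ANSI_X3.4-1968 means ASCII, Python's codec registry can be asked for the canonical name behind a label. This is only a small sketch; `canonical_codec_name` is a hypothetical helper name.

```python
import codecs


def canonical_codec_name(label):
    """Return the canonical name Python uses for an encoding label."""
    return codecs.lookup(label).name


ascii_name = canonical_codec_name("ANSI_X3.4-1968")  # label reported by sys.stdout.encoding
utf8_name = canonical_codec_name("utf8")              # label used in the question
```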

3

I guess you should read the file as a unicode object; that way you might not need to encode it.

import codecs
# Open for reading ('r'); mode 'w' would truncate the file.
file = codecs.open('file.html', 'r', 'utf-8')

2 Comments

Yes, I read all the files using the procedure you described. The problem was standard output: it was set to ANSI_X3.4-1968, so even though I read all the files as UTF-8, the text was being converted to ASCII when I used print to generate the HTML. Thanks a lot anyway, @hspandher
All is well, if end is well
