2

This code should write some text to a file. When I write the text to the console, everything works. But when I try to write the text to a file, I get a UnicodeEncodeError. I know this is a common problem that can be solved with the proper decode or encode call, but I tried that and I still get the same UnicodeEncodeError. What am I doing wrong?

I've attached an example.

print "(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)".decode("utf-8")%(dict.get('name'),dict.get('description'),dict.get('ico'),dict.get('city'),dict.get('ulCislo'),dict.get('psc'),dict.get('weby'),dict.get('telefony'),dict.get('mobily'),dict.get('faxy'),dict.get('emaily'),dict.get('dic'),dict.get('ic_dph'),dict.get('kategorie')[0],dict.get('kategorie')[1],dict.get('kategorie')[2]) 

(StarBuy s.r.o.,Inzertujte s foto, auto-moto, oblečenie, reality, prácu, zvieratá, starožitnosti, dovolenky, nábytok, všetko pre deti, obuv, stroj....

with open("test.txt","wb") as f: f.write("(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)".decode("utf-8")%(dict.get('name'),dict.get('description'),dict.get('ico'),dict.get('city'),dict.get('ulCislo'),dict.get('psc'),dict.get('weby'),dict.get('telefony'),dict.get('mobily'),dict.get('faxy'),dict.get('emaily'),dict.get('dic'),dict.get('ic_dph'),dict.get('kategorie')[0],dict.get('kategorie')[1],dict.get('kategorie')[2])) 

UnicodeEncodeError: 'ascii' codec can't encode character u'\u010d' in position 50: ordinal not in range(128)

Where could the problem be?

3
  • possible duplicate of Python Unicode Encode Error Commented Jul 12, 2014 at 22:22
  • why are you decoding it? Commented Jul 12, 2014 at 23:33
  • unrelated: you could simplify the string formatting: u"({})".format(','.join(map(your_dict.get, ['name', 'description', 'ico', ..., 'ic_dph'])+your_dict['kategorie'][:3])) Commented Jun 29, 2015 at 7:10

5 Answers

6

To write Unicode text to a file, you could use the io.open() function:

#!/usr/bin/env python
from io import open

with open('utf8.txt', 'w', encoding='utf-8') as file:
    file.write(u'\u010d')

On Python 3, the built-in open() is io.open(), so the import is not needed there.

Note: you should not use the binary file mode ('b') if you want to write text.

The # coding: utf8 declaration defines the source code encoding; it has nothing to do with how data is written to a file.

If you see sys.setdefaultencoding() outside of site.py or Python tests, assume the code is broken.
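As a rough illustration of the text-mode approach above, the sketch below writes a Unicode string with io.open() and then reads the raw bytes back to show what ends up on disk. The helper name write_text, the temporary path, and the sample word are assumptions made for the example, not part of the question's code.

```python
import io
import os
import tempfile


def write_text(path, text):
    """Write Unicode text to path as UTF-8 using text mode (hypothetical helper)."""
    # io.open accepts an encoding argument on both Python 2 and Python 3.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Sample value containing the character from the traceback (u'\u010d').
sample = u"oble\u010denie"

path = os.path.join(tempfile.mkdtemp(), "utf8.txt")
write_text(path, sample)

# Read the raw bytes back to see what was actually stored in the file.
with io.open(path, "rb") as f:
    raw = f.read()
```

The file object does the encoding here, so the calling code only ever handles Unicode text and never calls encode or decode itself.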


1

@ned-batchelder is right. You have to declare that the system default encoding is "utf-8". The coding comment # -*- coding: utf-8 -*- doesn't do this.

To declare the system default encoding, you have to import the module sys, and call sys.setdefaultencoding('utf-8'). However, sys was previously imported by the system and its setdefaultencoding method was removed. So you have to reload it before you call the method.

So, you will need to add the following code at the beginning:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')


0

You may need to explicitly declare that Python should use the UTF-8 encoding.

The answer to this SO question explains how to do that: Declaring Encoding in Python

6 Comments

I have it already declared: # -*- coding: utf-8 -*- at the top of my code
The coding comment on the .py file only affects how the .py source is decoded. It has nothing to do with the way data is decoded.
Then maybe you should ensure the strings in the dict are UTF-8 encoded before trying to decode them as such. You've only shown us a line of code, which isn't enough for me or probably anyone else to work with if it's a semantic issue.
Do you want to write a unicode string or ASCII string to the file? To write a unicode string, you probably should replace decode('utf-8') with encode('utf-8'). But I'm assuming the strings you are retrieving from the Dict are ASCII encoded. If that is the case, then you don't need the .decode('utf-8') at all.
@SW_user2953243 I'm getting the data from a web page that has "utf-8" in its head. I want to write this data to the file, and it says 'ordinal not in range(128)'.
0

For Python 2:

  1. Declare the source encoding at the top of the file (if not done yet):

    # -*- coding: utf-8 -*-

  2. Replace .decode with .encode, and apply it to the formatted result rather than to the format string, so that encoded bytes are what gets written:

    with open("test.txt","wb") as f:
        # Format as Unicode first, then encode the finished text once.
        f.write((u"(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)"%(dict.get('name'),dict.get('description'),dict.get('ico'),dict.get('city'),dict.get('ulCislo'),dict.get('psc'),dict.get('weby'),dict.get('telefony'),dict.get('mobily'),dict.get('faxy'),dict.get('emaily'),dict.get('dic'),dict.get('ic_dph'),dict.get('kategorie')[0],dict.get('kategorie')[1],dict.get('kategorie')[2])).encode("utf-8"))
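A shortened sketch of the encode-then-write idea in binary mode, assuming the dictionary values are Unicode text (which the u'\u010d' in the traceback suggests). The record dictionary and its three values are made up for illustration. Note that encode is called on the finished, formatted string, since a file opened with "wb" expects bytes.

```python
import os
import tempfile

# Hypothetical stand-in for the asker's dictionary; the values are Unicode text.
record = {
    "name": u"StarBuy s.r.o.",
    "description": u"oble\u010denie",
    "city": u"Ko\u0161ice",
}

# Build the whole line as Unicode first...
line = u"(%s,%s,%s)" % (
    record.get("name"),
    record.get("description"),
    record.get("city"),
)

# ...then encode the finished text once, just before the binary-mode write.
data = line.encode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wb") as f:
    f.write(data)

# Read the bytes back to confirm the file holds the UTF-8 encoding of the line.
with open(path, "rb") as f:
    stored = f.read()
```

Encoding only the format string has no effect on the values substituted into it, which is why the call has to wrap the result of the % operation.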


0

Previous answers have the right story, but as far as I can tell not the right solution. You need to set the encoding for the file object:

with open("test.txt", "w", encoding="utf-8") as f:
    f.write(stuff)
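To see why the file object's encoding matters, here is a small sketch that writes the character from the question's traceback twice: once through an ASCII file object and once through a UTF-8 one. It assumes Python 3, where the built-in open() accepts an encoding argument (on Python 2, io.open() would be needed instead); the temporary path is only for the example.

```python
import os
import tempfile

text = u"\u010d"  # the character named in the question's traceback
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# With an ASCII file object, the write fails with the same kind of error.
ascii_failed = False
try:
    with open(path, "w", encoding="ascii") as f:
        f.write(text)
except UnicodeEncodeError:
    ascii_failed = True

# With a UTF-8 file object, the same write succeeds.
with open(path, "w", encoding="utf-8") as f:
    chars_written = f.write(text)

# Read the text back using the same encoding.
with open(path, "r", encoding="utf-8") as f:
    read_back = f.read()
```

Passing the encoding explicitly also keeps the result independent of the platform's default, which may not be UTF-8.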
