decode/encode error django python

Question

I have a list of keywords
keywords = [u'encendió', u'polémica']

I am trying to load them to a django model:

class myKeywords(models.Model):
    keyword = models.CharField()

    def __unicode__(self):
        return self.keyword.encode('utf-8')

This is what I am trying:

for k in keywords:
    keyObj, created = myKeywords.objects.get_or_create(keyword=k.decode('utf-8'))
    print created, keyObj

However, I keep getting the django.utils.encoding.DjangoUnicodeDecodeError: 'ascii' codec can't decode byte.

I have tried:

adding/removing the u prefix in front of the keyword
removing decode('utf-8') when creating the keyword object -- this successfully creates and saves the object, as long as the keyword has the u prefix
removing encode('utf-8') from the __unicode__(self) method -- this successfully prints the keyword

So, the only configuration that is working is as follows:

keep the u prefix in front of the keyword
don't call decode('utf-8') or encode('utf-8') anywhere else

But I am not sure if this is the right way of doing this. Ideally I should be reading a keyword and decoding it as utf-8 and then be saving it to the db. Any suggestions?

Alasdair · Accepted Answer · 2015-07-14 17:04:24Z

3

The __unicode__ method should return a unicode string, not a byte string. Therefore you should remove the encode() from your __unicode__ method.

If your keywords have the u'' prefix, then they are unicode strings as well, and don't have to be decoded either.
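
To see where the 'ascii' codec message in the question comes from, here is a minimal sketch without Django. When __unicode__ returns encoded bytes, Python 2 has to convert them back to unicode implicitly, and it uses its default ASCII codec for that; the explicit decode('ascii') below stands in for that implicit step. The variable names are illustrative, and the sketch is written so that it also runs on Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

keyword = 'encendió'               # text (unicode) string
encoded = keyword.encode('utf-8')  # byte string, like the original __unicode__ returned

# Stand-in for the implicit bytes -> unicode conversion that Python 2
# performs with its default ASCII codec.
try:
    encoded.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the codec that was used for encoding round-trips cleanly.
round_trip = encoded.decode('utf-8')

print(error_message)
print(round_trip == keyword)
```

The failure comes from the non-ASCII bytes produced by encode('utf-8'), which is why returning self.keyword unchanged avoids it.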

edited Jul 14, 2015 at 17:04

answered Jul 14, 2015 at 16:56


8 Comments

Alasdair Over a year ago

If you want to remove the u to make your code cleaner, then you might want to use from __future__ import unicode_literals. Otherwise, removing the u isn't a good idea. Unicode bugs occur when converting between unicode and byte strings. If you only deal with unicode strings, then you avoid those problems.
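
A small sketch of what that import changes, assuming the keyword list from the question. With it, unprefixed literals in the module are unicode strings on Python 2; on Python 3 the import is accepted and has no effect. The helper name text_type is made up for this example.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

# No u prefix needed: these literals are text strings, not byte strings.
keywords = ['encendió', 'polémica']

text_type = type(u'')  # unicode on Python 2, str on Python 3
all_text = all(isinstance(k, text_type) for k in keywords)

# Lengths are counted in characters, not in UTF-8 bytes.
lengths = [len(k) for k in keywords]

print(all_text)
print(lengths)
```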

Peter DeGlopper Over a year ago

Yes, encode is why you're getting that specific error. __unicode__ should (unsurprisingly) return a Unicode object, not encoded bytes.

Alasdair Over a year ago

Yes, you should remove the encode() from the unicode method, for the reason that Peter gives.

Peter DeGlopper Over a year ago

In general, you should expect to handle the decoding from bytes to Unicode when taking data from an external source, like a file (best handled with codecs.open on Python 2, Python 3 has better tools) or a web request. Django handles it for you when processing the web request, though, so it's rare that you need to do this at all in Django.
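
As a rough illustration of decoding at the boundary, the sketch below reads keywords from a UTF-8 file. It uses io.open rather than codecs.open, because io.open is available on Python 2.6+ and is the same function as the built-in open on Python 3. The file name and its contents are hypothetical and are created here only so the example is self-contained.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io
import os
import tempfile

# Hypothetical input file, written here only to make the sketch runnable.
path = os.path.join(tempfile.mkdtemp(), 'keywords.txt')
with io.open(path, 'w', encoding='utf-8') as handle:
    handle.write('encendió\npolémica\n')

# Decode once, while reading: with an encoding given, io.open yields text strings.
with io.open(path, encoding='utf-8') as handle:
    keywords = [line.strip() for line in handle]

text_type = type(u'')  # unicode on Python 2, str on Python 3
print(all(isinstance(k, text_type) for k in keywords))
```

After this point the keywords are unicode strings and can be passed to get_or_create() without any further decode() call.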

Peter DeGlopper Over a year ago

The error specifically comes from mixing encoded bytestrings with Unicode. encode on a Unicode object is perfectly valid depending on what you want, but here it was causing your __unicode__ method to return an object of the wrong type.


Rahul Gupta · Accepted Answer · 2015-07-14 17:20:39Z

You don't need to encode() the strings to utf-8 in the __unicode__() method, because Django returns all strings from the database as unicode.

From the docs:

Because all strings are returned from the database as Unicode strings, model fields that are character based (CharField, TextField, URLField, etc) will contain Unicode values when Django retrieves data from the database. This is always the case, even if the data could fit into an ASCII bytestring.

Since your keywords are already unicode strings (as indicated by the u prefix), you don't need to call decode() on them when creating the objects. Remove the decode() as well.

Your code should look like:

models.py

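from django.db import models


class myKeywords(models.Model):
    # max_length is required by CharField; 255 is an arbitrary choice
    keyword = models.CharField(max_length=255)

    def __unicode__(self):
        # keyword is already a unicode string, so return it without encoding
        return u'%s' % self.keyword


keywords = [u'encendió', u'polémica']

for k in keywords:
    # pass the unicode string straight through; no decode() needed
    keyObj, created = myKeywords.objects.get_or_create(keyword=k)
    print created, keyObj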
