
I'm getting this common error in my code:

Exception Type: UnicodeEncodeError
Exception Value: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128)

Why? Because I'm handling first names with French accents.

This is my code:

if 'rechercheGED' in request.GET:
    query_social_number = request.GET.get('q1social')
    sort_params = {}
    Individu_Recherche.set_if_not_none(sort_params, 'NumeroIdentification__iexact', query_social_number)
    query_lastname_list = Individu_Recherche.Recherche_Get(Individu, sort_params)
    lastname = query_lastname_list.Nom
    firstname = query_lastname_list.Prenom
    NIU = query_lastname_list.NumeroIdentification
    title = str(lastname + "_" + firstname + "_" + NIU)

The issue comes from: firstname = query_lastname_list.Prenom

In my case, the first name is Jérôme.

I tried a few things:

1) Inserting this at the beginning of the file:

#-*- coding: utf-8 -*-
from __future__ import unicode_literals

2) Use firstname = query_lastname_list.Prenom.encode('utf-8') and firstname = query_lastname_list.Prenom.decode('utf-8')

But so far I have not been able to get rid of this error and handle data with accents.

Do you have any idea?

EDIT:

This is the full traceback:

Environment:

Request Method: GET
Request URL: http://localhost:8000/Identification/Person/Research/?q1social=19910-00001-634239-2&rechercheGED=Rechercher

Django Version: 1.10.3
Python Version: 2.7.12
Installed Applications:
['Institution',
 'django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'bootstrapform',
 'django_countries',
 'debug_toolbar',
 'chartit',
 'Configurations',
 'Home',
 'Authentication',
 'Identity',
 'rest_framework']
Installed Middleware:
['django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.locale.LocaleMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.middleware.gzip.GZipMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.security.SecurityMiddleware',
 'debug_toolbar.middleware.DebugToolbarMiddleware',
 'DatasystemsCORE.middleware.OnlineNowMiddleware']

Traceback:

File "/usr/local/lib/python2.7/site-packages/django/core/handlers/exception.py" in inner
  39. response = get_response(request)

File "/usr/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _legacy_get_response
  249. response = self._get_response(request)

File "/usr/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  187. response = self.process_exception_by_middleware(e, request)

File "/usr/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  185. response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/usr/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
  23. return view_func(request, *args, **kwargs)

File "/Users/valentinjungbluth/Desktop/Django/DatasystemsCORE/DatasystemsCore/DatasystemsCORE/Identity/views.py" in IdentityIndividuResearching
  454. title = str(lastname + "_" + firstname + "_" + NIU)

Exception Type: UnicodeEncodeError at /Identification/Person/Research/
Exception Value: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128)
  • What python and django versions are you using? Commented Oct 3, 2017 at 10:08
  • Python 2.7 and Django 1.10.3 ;) Commented Oct 3, 2017 at 10:08
  • Is there some reason why you aren't using Python 3? It has much better Unicode handling, and Python 2 will reach its End of Life in 2020. Commented Oct 3, 2017 at 10:10
  • None, it's just that I began studying Python 3-4 years ago and it was Python 2.x back then. With Python 3, would I not have this kind of issue? Commented Oct 3, 2017 at 10:11
  • First of all, show the full traceback; we can't even see where the error is happening. Secondly, show how you tried 'encode' and 'decode'. (And thirdly, stop randomly trying things like adding a coding declaration or unicode_literals; both only apply to literal characters in your code.) Commented Oct 3, 2017 at 10:11

2 Answers

5

Since you're using Python 2.x, the problem does lie with the last line of the traceback.

title = unicode(str(lastname + "_" + firstname + "_" + NIU)) 

At least one of lastname, firstname or NIU contains characters that cannot be represented in 7-bit ASCII, which is the encoding str() uses when it is handed a Unicode string in Python 2.

Django comes with the useful functions force_text and force_bytes for these sorts of string casts, and also it's a good idea to use string interpolation instead of + when doing this sort of stuff:

# force_text is defined in django.utils.encoding
from django.utils.encoding import force_text

title = force_text('%s_%s_%s' % (lastname, firstname, NIU))
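For comparison, here is a minimal sketch of the same interpolation without Django. It assumes all three values are already Unicode strings, which the traceback suggests is the case here. The helper name build_title and the last name are hypothetical; the identification number is the one from the request URL in the question.

```python
# -*- coding: utf-8 -*-

def build_title(lastname, firstname, niu):
    """Join the three parts with underscores, staying in Unicode throughout."""
    # A Unicode format string keeps the result Unicode on Python 2 and 3 alike.
    return u'%s_%s_%s' % (lastname, firstname, niu)


# Hypothetical sample values standing in for the model fields.
title = build_title(u'Dupont', u'J\xe9r\xf4me', u'19910-00001-634239-2')
print(repr(title))
```

Because nothing here calls str() on the result, no implicit ASCII encoding takes place.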

2 Comments

Thank you, it works very well! Thanks to you I learned a new way to do this concatenation. I have just one question: if I have to use Python 3.x, is the syntax the same as in Python 2.x?
Yes. The % syntax works in 3.x too, but be sure to check out pyformat.info to see everything you can do :)
3

In Python 3, your title = str(lastname + "_" + firstname + "_" + NIU) would work the way you expect. In Python 2, you're mixing Unicode and byte strings: the concatenation itself succeeds and produces a Unicode string, but str() then "helpfully" tries to encode that result as ASCII, and of course that fails if the text cannot be represented in ASCII.
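The failure can be reproduced outside Django. This is a minimal sketch with hypothetical sample values in place of the database fields; the explicit .encode('ascii') stands in for the implicit conversion that str() performs on a Unicode string in Python 2.

```python
# -*- coding: utf-8 -*-

# Hypothetical values standing in for the database fields.
lastname = u'Dupont'
firstname = u'J\xe9r\xf4me'  # u'Jérôme', written with escapes

# Concatenating Unicode strings is not the problem.
title = lastname + u'_' + firstname

try:
    # Encoding to ASCII is the step that cannot represent the accented letters.
    title.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(repr(error))
```

The exception reports the index of the first character that ASCII cannot represent, which is how the "position 6" in the question's error message should be read.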

A simple way to fix this is to make everything Unicode. Eg,

firstname = u'Jérôme'
lastname = u'Pécresse'
title = lastname + u'_' + firstname
print title

output

Pécresse_Jérôme 

I strongly recommend that you migrate to Python 3. It will make things much more pleasant once you get used to the different (but far superior) handling of text and bytes.

In the meantime, you may find this article helpful: Pragmatic Unicode, which was written by SO veteran Ned Batchelder.

2 Comments

Thank you for your answer! The only difference is that I'm reading data with accents from my database and doing some processing on it later. So I can't write u'Jérôme', because this first name comes from MyModel.FirstnameField
@Deadpool Understood. If query_lastname_list.Nom is an actual Unicode object (which it appears to be from your Traceback), then my code will still work. However, in Python 2 it's often simpler to encode everything to UTF-8, and work with that, as Ned explains in Pragmatic Unicode.
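A minimal sketch of the encode-to-UTF-8 step mentioned in the last comment, assuming the title is already a Unicode string. The sample value is hypothetical.

```python
# -*- coding: utf-8 -*-

# Hypothetical Unicode title, e.g. built from the model fields.
title = u'Dupont_J\xe9r\xf4me'

# Encode explicitly to UTF-8 where bytes are needed (files, sockets, ...).
title_bytes = title.encode('utf-8')

# Decoding with the same codec recovers the original text.
round_trip = title_bytes.decode('utf-8')

print(repr(title_bytes))
```

Each accented letter becomes two bytes in UTF-8, so the byte string is longer than the text it encodes.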
