Encoding problems at the end of the execution

Question

I have an encoding problem with my script. Here it is:

    def parse_airfields():
        html = urlopen('https://www.sia.aviation-civile.gouv.fr/aip/enligne/FRANCE/AIRAC-2015-09-17/html/eAIP/FR-AD-1.3-fr-FR.html').read()
        html = html.decode('utf-8')
        soup = BeautifulSoup(html, 'lxml')
        # A lot of work [....]
        return airfields

    if __name__ == '__main__':
        airfields = parse_airfields()
        for airfield in airfields:
            for value in airfield.values():
                if isinstance(value, str):
                    value.encode('utf-8')
        with open('airfields.json', 'w') as airfields_file:
            json.dump(airfields, airfields_file, indent=4, sort_keys=True)

I tried it without encode() and without decode(), but I get the same result: an encoding problem in my JSON file.

Why? Thanks for your help!

What problem? Also, that line that calls the encode() method disposes of the result.
– Ignacio Vazquez-Abrams, Commented Sep 25, 2015 at 4:00

ShadowRanger · Accepted Answer · 2015-09-25 04:03:37Z

str.encode and bytes.decode don't modify the value in place; you're not assigning the return value of value.encode('utf-8') so you haven't actually changed anything. Of course, I don't think you really want to; the json module works with text (str), not binary data (bytes).
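As a minimal sketch of that point (the sample string is a made-up value, not taken from the question's data), the following shows that encode() returns a new bytes object and leaves the original str untouched, and that the json module refuses bytes anyway:

```python
import json

value = "Aérodrome"  # hypothetical sample value for illustration

# encode() returns a new bytes object; str is immutable, so `value` is unchanged.
encoded = value.encode("utf-8")

print(type(value).__name__)    # the original is still a str
print(type(encoded).__name__)  # the result is a separate bytes object

# json serializes text (str); passing bytes raises TypeError.
rejected = False
try:
    json.dumps({"name": encoded})
except TypeError:
    rejected = True

print("bytes rejected by json:", rejected)
```

So even if the result of encode() had been assigned back into the dictionary, json.dump would have failed rather than produced the desired file.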

The problem is that Python's json module escapes non-ASCII characters in strings by default (ensure_ascii=True), so it writes escapes such as \u00b0 instead of the characters themselves. That output is still valid JSON and decodes back to the same text. Python will write the characters directly if you tell it to: just add ensure_ascii=False to the arguments of your json.dump(...) call.
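To illustrate the ensure_ascii behaviour, here is a small sketch. The airfields list and the temporary output path are placeholders I made up, not the question's real data. Passing encoding='utf-8' to open() is my own addition, on the assumption that you want the file to be UTF-8 regardless of the platform's default encoding.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the parsed data from the question.
airfields = [{"name": "Aérodrome", "latitude": "48°43'N"}]

escaped = json.dumps(airfields)                       # default: ensure_ascii=True
readable = json.dumps(airfields, ensure_ascii=False)  # keeps é and ° as-is

print(escaped)
print(readable)

# Both forms decode to exactly the same Python data.
same_data = json.loads(escaped) == json.loads(readable) == airfields

# Writing the readable form to a file, with an explicit file encoding.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "airfields.json")
    with open(path, "w", encoding="utf-8") as airfields_file:
        json.dump(airfields, airfields_file, indent=4, sort_keys=True,
                  ensure_ascii=False)
    with open(path, encoding="utf-8") as airfields_file:
        written = airfields_file.read()
```

Either form is valid JSON; ensure_ascii=False only matters if people or tools need to read the file as plain text.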

Wow! That's perfect! Thanks for your explanation. Have a nice day.
