I have an encoding problem with my script. Here it is:

    def parse_airfields():
        html = urlopen('https://www.sia.aviation-civile.gouv.fr/aip/enligne/FRANCE/AIRAC-2015-09-17/html/eAIP/FR-AD-1.3-fr-FR.html').read()
        html = html.decode('utf-8')
        soup = BeautifulSoup(html, 'lxml')
        # A lot of work [....]
        return airfields

    if __name__ == '__main__':
        airfields = parse_airfields()
        for airfield in airfields:
            for value in airfield.values():
                if isinstance(value, str):
                    value.encode('utf-8')
        with open('airfields.json', 'w') as airfields_file:
            json.dump(airfields, airfields_file, indent=4, sort_keys=True)

I tried without encode() and without decode(), but I get the same result: an encoding problem in my JSON file.

Why? Thanks for your help!

    What problem? Also, that line that calls the encode() method disposes of the result. Commented Sep 25, 2015 at 4:00

1 Answer

str.encode and bytes.decode don't modify the value in place; you're not assigning the return value of value.encode('utf-8') so you haven't actually changed anything. Of course, I don't think you really want to; the json module works with text (str), not binary data (bytes).
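To illustrate the point about return values, here is a minimal sketch. The sample string is made up for the example and is not taken from the question's data.

```python
import json

value = "47°54'N"  # hypothetical sample value containing a non-ASCII character

# Calling encode() without assigning the result leaves `value` untouched,
# because str objects are immutable.
value.encode("utf-8")
still_text = isinstance(value, str)

# The encoded bytes only exist if you keep the return value.
encoded = value.encode("utf-8")
round_trip = encoded.decode("utf-8")

# json works with str; handing it bytes raises TypeError on Python 3.
try:
    json.dumps({"position": encoded})
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
```

So even if the loop in the question had assigned the result, the json.dump call would then have failed on the bytes values.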

The problem is that, by default, Python's json module doesn't write non-ASCII characters into its strings; it uses escapes instead, e.g. \u00b0. That output is still valid JSON and decodes back to the same text. Python will write the characters directly if you tell it to, though: just add ensure_ascii=False to the arguments of your json.dump(...) call.
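A short sketch of the difference, assuming Python 3. The airfield record and the temporary file path are invented for the example. Passing encoding='utf-8' to open() is an addition not mentioned above; it makes the file encoding explicit instead of relying on the platform default.

```python
import json
import os
import tempfile

# Hypothetical sample record, standing in for the parsed airfields.
airfields = [{"name": "Orléans", "latitude": "47°54'N"}]

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(airfields)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(airfields, ensure_ascii=False)

# Both forms decode to exactly the same data.
same_data = json.loads(escaped) == json.loads(readable) == airfields

# Writing to a file: open it in text mode with an explicit encoding.
path = os.path.join(tempfile.mkdtemp(), "airfields.json")
with open(path, "w", encoding="utf-8") as airfields_file:
    json.dump(airfields, airfields_file, indent=4, sort_keys=True,
              ensure_ascii=False)

with open(path, encoding="utf-8") as airfields_file:
    text = airfields_file.read()
```

Either form is fine for another JSON parser; ensure_ascii=False mainly helps when a person reads the file.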

1 Comment

Wow! That's perfect! Thanks for your explanation. Have a nice day.
