I am creating an XML file in Python, and one field of the XML holds the contents of a text file. I do it like this:
    import xml.etree.ElementTree as ET

    f = open('myText.txt', "r")
    data = f.read()
    f.close()

    root = ET.Element("add")
    doc = ET.SubElement(root, "doc")
    field = ET.SubElement(doc, "field")
    field.set("name", "text")
    field.text = data

    tree = ET.ElementTree(root)
    tree.write("output.xml")

And then I get a UnicodeDecodeError. I already tried putting the special comment # -*- coding: utf-8 -*- at the top of my script, but I still got the error. I also tried forcing the encoding of my variable with data.encode('utf-8'), but still got the error. I know this issue is very common, but none of the solutions I found in other questions worked for me.
UPDATE
Traceback: Using only the special comment on the first line of the script
    Traceback (most recent call last):
      File "D:\Python\lse\createxml.py", line 151, in <module>
        tree.write("D:\\python\\lse\\xmls\\" + items[ctr][0] + ".xml")
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 820, in write
        serialize(write, self._root, encoding, qnames, namespaces)
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml
        _serialize_xml(write, e, encoding, qnames, None)
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml
        _serialize_xml(write, e, encoding, qnames, None)
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 937, in _serialize_xml
        write(_escape_cdata(text, encoding))
      File "C:\Python27\lib\xml\etree\ElementTree.py", line 1073, in _escape_cdata
        return text.encode(encoding, "xmlcharrefreplace")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 243: ordinal not in range(128)

Traceback: Using .encode('utf-8')
    Traceback (most recent call last):
      File "D:\Python\lse\createxml.py", line 148, in <module>
        field.text = data.encode('utf-8')
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 227: ordinal not in range(128)

When I used .decode('utf-8') instead, the error went away and the XML file was created successfully. The remaining problem is that the XML is not viewable in my browser.
Comments:

Use decode instead of encode.

I used .decode, but the file is not viewable in my browser.

# -*- coding: utf-8 -*- serves only to let you put non-ASCII characters in the Python source. It doesn't affect the encoding or decoding of strings in any way. Also, if the file myText.txt isn't ASCII, you should use codecs.open and provide the right encoding: codecs.open('myText.txt', 'r', 'utf-8').

Specify the encoding in tree.write if your text is not just ASCII (see also the docs).
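Putting the two suggestions above together, a minimal sketch might look like the following. It assumes myText.txt is actually UTF-8 encoded (the 0xc2 byte in the tracebacks is consistent with that, but it is not confirmed). It uses io.open, which for this purpose does the same job as the codecs.open call mentioned above, and the build_xml helper and the temporary file paths are made up for illustration.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
import xml.etree.ElementTree as ET


def build_xml(text_path, xml_path):
    """Hypothetical helper: read a UTF-8 text file into an XML field."""
    # Decode the bytes on disk into a unicode string while reading.
    # Assumption: the input file is UTF-8 encoded.
    with io.open(text_path, "r", encoding="utf-8") as f:
        data = f.read()

    root = ET.Element("add")
    doc = ET.SubElement(root, "doc")
    field = ET.SubElement(doc, "field")
    field.set("name", "text")
    field.text = data  # unicode text, not raw bytes

    # Encode to UTF-8 on output and write the XML declaration,
    # so that readers of the file know which encoding was used.
    tree = ET.ElementTree(root)
    tree.write(xml_path, encoding="utf-8", xml_declaration=True)


# Self-check with temporary files (illustrative paths only).
tmp_dir = tempfile.mkdtemp()
text_path = os.path.join(tmp_dir, "myText.txt")
xml_path = os.path.join(tmp_dir, "output.xml")

with io.open(text_path, "w", encoding="utf-8") as f:
    f.write(u"caf\u00e9 \u00a9")

build_xml(text_path, xml_path)

# Parse the result back to confirm the text survived the round trip.
parsed_text = ET.parse(xml_path).getroot().find("doc/field").text
```

The idea is to decode once when reading and encode once when writing, so that the element tree only ever holds unicode text. Whether the missing encoding declaration is what kept the browser from displaying the file is a guess, but an explicit declaration removes one possible cause.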