Unable to decode this string using Python

Question

I have a text.ucs file that I am trying to decode using Python.

file = open('text.ucs', 'r')
content = file.read()
print content

My result is

\xf\xe\x002\22

I tried decoding with utf-16 and utf-8:

content.decode('utf-16')

and I get this error:

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 32-33: illegal encoding

Please let me know if I am missing anything or if my approach is wrong.

Edit: A screenshot was requested, so I have added one.

Try encoding='utf_16_be' (stackoverflow.com/a/14488478/1388292)
– Jacques Gaudin, Commented May 7, 2018 at 10:42

filmor · Accepted Answer · 2018-05-07 10:57:22Z

1

The string is encoded as UTF-16-BE (big endian); this works:

content.decode("utf-16-be")
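Why the byte order matters can be sketched as follows. This is a minimal illustration that assumes a made-up sample string (`sample`), since the asker's actual file contents are not available; it runs on both Python 2 and Python 3.

```python
# Hypothetical sample text; the asker's actual file contents are not known.
sample = u"hello"

# UTF-16-BE stores each 16-bit code unit high byte first, with no byte order mark.
raw = sample.encode("utf-16-be")

# Decoding with the matching byte order recovers the text.
decoded = raw.decode("utf-16-be")

# Decoding with the opposite byte order does not fail for this input, but it
# yields unrelated characters because every byte pair is read swapped.
swapped = raw.decode("utf-16-le")

print(repr(decoded))
print(decoded == sample, swapped == sample)
```

The plain "utf-16" codec relies on a byte order mark at the start of the data; without one it falls back to the platform's native byte order, which is little endian on a typical Windows machine. That is one plausible reason why it can fail on big-endian data that has no mark.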

edited May 7, 2018 at 10:57

answered May 7, 2018 at 10:42

filmor


3 Comments

cyborg Over a year ago

@JacquesGaudin Unfortunately neither of them works. Also, the Python docs show '-' and not '_'.

filmor Over a year ago

@cyborg I just ran this on the bytes that you provided and it worked fine. The names with dashes and underscores are equivalent; see the first paragraph of docs.python.org/3/library/codecs.html#standard-encodings

cyborg Over a year ago

>>> content.decode("utf_16_be")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_16_be.py", line 16, in decode
    return codecs.utf_16_be_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x5c in position 64: truncated data
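Both points raised in these comments can be checked directly. The sketch below is only an illustration: the three-byte input `odd_length` is made up and is not the asker's data. It shows that the dash and underscore spellings resolve to the same codec, and that an input with an odd number of bytes is one way to trigger a UnicodeDecodeError with UTF-16.

```python
import codecs

# Codec lookup normalizes the spelling, so both names resolve to the same codec.
same_codec = codecs.lookup("utf-16-be").name == codecs.lookup("utf_16_be").name

# Hypothetical input: three bytes cannot form a whole number of 16-bit units.
odd_length = b"\x00h\x00"

try:
    odd_length.decode("utf-16-be")
    error_reason = None
except UnicodeDecodeError as exc:
    # The exception carries a short description of what went wrong.
    error_reason = exc.reason

print(same_codec, error_reason)
```

If the bytes being decoded have an odd length, it may be worth checking how they were obtained, for example whether the file was opened in binary mode ('rb') so that no bytes are altered or dropped on the way in.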

artur1214 · Accepted Answer · 2018-05-07 13:24:53Z

As I understand it, you are using Python 2.x, but as far as I know the encoding parameter of the built-in open() was only added in Python 3.x. I am no expert in Python 2.x, but you can look up io.open, which accepts it. For example, try:

import io

file = io.open('text.ucs', 'r', encoding='utf-8')
content = file.read()
print content

Note that the io module has to be imported first.
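A self-contained sketch of the io.open approach, which should run on both Python 2 (2.6 or later) and Python 3. It makes two assumptions that are not in this answer: it uses 'utf-16-be' as the encoding, as suggested in the other answer, and it reads a temporary file with made-up contents instead of the asker's text.ucs.

```python
import io
import os
import tempfile

# Hypothetical stand-in for text.ucs: a temporary file holding UTF-16-BE bytes.
fd, path = tempfile.mkstemp(suffix=".ucs")
with os.fdopen(fd, "wb") as out:
    out.write(u"hello".encode("utf-16-be"))

try:
    # io.open accepts encoding= on Python 2 and is the same function as the
    # built-in open on Python 3.
    with io.open(path, "r", encoding="utf-16-be") as f:
        content = f.read()
finally:
    # Clean up the temporary file whether or not reading succeeded.
    os.remove(path)

print(repr(content))
```

With io.open the decoding happens while reading, so content is already a unicode string and no separate content.decode(...) call is needed.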

MrLeeh · Accepted Answer · 2018-05-07 10:57:01Z

0

You can specify which encoding to use with the encoding argument:

with open('text.ucs', 'r', encoding='utf-16') as f:
    text = f.read()

edited May 7, 2018 at 10:57

MrLeeh


answered May 7, 2018 at 10:37

artur1214


5 Comments

cyborg Over a year ago

The error message I got:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'encoding' is an invalid keyword argument for this function

artur1214 Over a year ago

Maybe you forgot that when you use with open() you need to bind the file to a name, for example: with open('text.ucs', 'r', encoding='utf-16') as file

cyborg Over a year ago

I did that; I am aware of how with is used :)

artur1214 Over a year ago

I really don't know why it doesn't work. I tested it 20 seconds ago and this construct raised no errors. My full code is: with open(source, 'r', encoding='utf-8') as csvfile:...

cyborg Over a year ago

I have added a screenshot to the question; please check.

Skiller Dz · Accepted Answer · 2018-05-07 11:03:40Z

0

Your string needs to be decoded with the utf-8 encoding. You can do what I did to decode your string:

f = open('text.ucs', 'r', encoding='utf-8')
print f.read()

edited May 7, 2018 at 11:03

answered May 7, 2018 at 10:43

Skiller Dz


2 Comments

Jacques Gaudin Over a year ago

A few words to explain why you think this fixes the OP issue would be a good idea. stackoverflow.com/help/how-to-answer

Skiller Dz Over a year ago

@JacquesGaudin Thanks for that. I will be more careful with my answers next time.
