I'm trying to import data from a text file using numpy.loadtxt. This is something I've done many times in the past without issue. However, after generating a new set of text files to import, something about the encoding must be different because I get an error when trying to run the following code:

import numpy as np
asdf = np.loadtxt('data/asdf.txt', skiprows=28, max_rows=720, usecols=range(1,722))

The error message I receive is:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/Users/iangullett/Desktop/coadd/coadd.py in <module>()
     61
     62
---> 63 test = np.loadtxt('data/asdf.txt')
     64
     65

/Users/iangullett/opt/anaconda2/lib/python2.7/site-packages/numpy/lib/npyio.pyc in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin, encoding, max_rows)
   1091         try:
   1092             while not first_vals:
-> 1093                 first_line = next(fh)
   1094                 first_vals = split_line(first_line)
   1095         except StopIteration:

/Users/iangullett/opt/anaconda2/lib/python2.7/codecs.pyc in decode(self, input, final)
    312         # decode input (taking the buffer into account)
    313         data = self.buffer + input
--> 314         (result, consumed) = self._buffer_decode(data, self.errors, final)
    315         # keep undecoded input until the next call
    316         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte

And for reference, here is a bit of the beginning of the text file I'm trying to read (which is actually very large):

Detector Viewer Listing
File : C:\file_path_hidden
Title:
Date : 10/16/2019
Detector 6, NSCG Surface 1: Max polar angle: 90.00 deg, Total Hits = 224724030
Peak Intensity : 3.957E+005 Watts/Steradian
Total Power : 9.915E-001 Watts
Data Type : Radiant Intensity
Maximum Angle : 90.0000
Detector X : 0.0000
Detector Y : 0.0000
Detector Z : 0.0000
Detector Tilt X : 0.0000
Detector Tilt Y : 180.0000
Detector Tilt Z : 0.0000
Units : Watts/Steradian
Radial Pixels : 721, increment 0.1250 degrees
Azimuthal Pixels: 720, increment 0.5000 degrees
Columns are radial angles, rows are azimuthal angles.
Power Values:
1 2 3 4 5 6 7 8

Any help would be greatly appreciated.

  • Did you save the file with UTF-16 encoding in Notepad by any chance? Commented Mar 1, 2020 at 2:00
  • What version of numpy? Commented Mar 1, 2020 at 2:06
  • I don't believe so. The text files come from an optical simulation package called Zemax OpticStudio. Nothing happens to them between exporting them from Zemax and attempting to import them. Is there an easy way to check the encoding? Or rather, does numpy.loadtxt() expect a certain encoding? Commented Mar 1, 2020 at 2:08
  • I'm running numpy version 1.16.5 Commented Mar 1, 2020 at 2:09
  • That should be fine. You need to find out what the output encoding is. Commented Mar 1, 2020 at 2:11
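
One rough way to check the encoding, as asked in the comments above, is to read the first few bytes of the file and look for a byte-order mark (BOM). The following is only a minimal sketch: `sniff_bom` is a hypothetical helper name, not part of NumPy or the standard library, and a file without a BOM simply returns None, which says nothing about its encoding either way.

```python
import codecs

# Known byte-order marks, paired with a Python codec name that consumes them.
# UTF-32 is tested before UTF-16 because the UTF-32 LE BOM also starts with FF FE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(path):
    """Return a codec name if the file starts with a known BOM, else None."""
    with open(path, "rb") as fh:
        head = fh.read(4)  # the longest BOM checked here is 4 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

Calling `sniff_bom('data/asdf.txt')` on the file from the question would be expected to return a UTF-16 codec name if the 0xFF byte in the error really is the start of a UTF-16 BOM.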

1 Answer

np.loadtxt has supported an encoding argument since version 1.14.0, which lets you set the encoding manually. A first byte of 0xFF suggests UTF-16: a little-endian UTF-16 file usually starts with the byte-order mark 0xFF 0xFE, and 0xFF is never a valid UTF-8 start byte, which is exactly what the error reports. The surest way to determine the encoding, though, is to check how the program that created the file writes its output.
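
As a rough illustration of passing the encoding explicitly, here is a minimal sketch. `load_power_values` and its parameter names are hypothetical, introduced only for this example. It assumes the file is UTF-16 with a BOM and that the first column of each data row is an index to skip, as the `usecols=range(1,722)` in the question implies. Note that `max_rows` needs NumPy 1.16 or later.

```python
import numpy as np


def load_power_values(path, header_rows, n_rows, n_cols, encoding="utf-16"):
    """Load a numeric block from a text file, decoding it with `encoding`.

    header_rows: number of leading lines to skip before the data starts
    n_rows:      number of data rows to read
    n_cols:      number of value columns, not counting the leading index column
    """
    return np.loadtxt(
        path,
        encoding=encoding,             # assumption: the export is UTF-16 with a BOM
        skiprows=header_rows,
        max_rows=n_rows,
        usecols=range(1, n_cols + 1),  # skip column 0 (the row index)
    )
```

With the numbers from the question this would be called as `load_power_values('data/asdf.txt', 28, 720, 721)`.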

2 Comments

This was the issue. Adding encoding='UTF-16' to the loadtxt() function solved the problem. Thank you.
It's a pretty Windows-specific problem to have. UTF-16 is such a PITA.
