How can I fix the 'got 4 columns instead of 7' error when loading data from a .dat file in Python using NumPy?

Question

Okay, I'm new to Python but have to use it for my physics class. Basically, I don't understand the error I get when loading data from a .dat file.

The .dat file looks something like this:

; line 8
; time (s) motorpos (steps) azimuth (deg) signal_magnitude
0.90 0 0.000 1.24230e-01
3.07 6000 3.000 1.24530e-01
5.27 12000 6.000 1.23240e-01
7.55 18000 9.000 1.23330e-01

My modest attempt was:

import numpy as np
import matplotlib.pyplot as plt

data = np.genfromtxt('Data.dat', comments=';', skip_header=8, skip_footer=1, names=True, type=None)
print(data)

But I got:
Line #10 (got 4 columns instead of 7)

I guess my problem is that I'm using the wrong delimiter, but "by default, any consecutive whitespaces should act as delimiter." So I'm kind of lost here.

Hello, welcome to Stack Overflow! It is very helpful when the answer is formatted right, because it is more readable. You can learn more about formatting here: stackoverflow.com/editing-help.
– LoukasPap, Commented May 23, 2023 at 13:48
The header line has units between parentheses, but more importantly, as separate "words" in that line, surrounded by whitespace. Thus, that line is interpreted as a header for 7 columns. But the resulting data is just 4 columns, the actual values.
– 9769953, Commented May 23, 2023 at 13:57
You'll need to either fix the input file, by e.g. removing the units or appending them directly to the corresponding column name (e.g., "time[s] motorpos[steps] azimuth[deg] signal_magnitude"), or skip the header entirely, and provide the column names yourself in the code, using the names argument set to a list of strings in genfromtxt.
– 9769953, Commented May 23, 2023 at 13:58

9769953 · Accepted Answer · 2023-05-23 14:05:56Z

The header line has units between parentheses, but more importantly, as separate "words" in that line, surrounded by whitespace:

; time (s) motorpos (steps) azimuth (deg) signal_magnitude

Thus, that line is interpreted as a header for 7 columns, while the data rows below it have just 4 columns, the actual values. (And indeed, the fact that the header line is a comment is ignored here; the documentation about this says:

If names is True, the field names are read from the first line after the first skip_header lines. This line can optionally be preceded by a comment delimiter.

So you could also try and remove the names argument, and let the header line become a comment line, but you'd miss the information about the column names.)
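To see where the 7 comes from, here is a rough sketch that mimics whitespace splitting on the header and on a data row using plain Python. It only approximates what genfromtxt does internally, and the variable names are made up for illustration.

```python
# Header and first data row copied from the question's file.
header = "; time (s) motorpos (steps) azimuth (deg) signal_magnitude"
row = "0.90 0 0.000 1.24230e-01"

# Drop the leading comment marker, then split on runs of whitespace.
header_fields = header.lstrip("; ").split()
row_fields = row.split()

print(len(header_fields), header_fields)  # 7 "words": each unit counts separately
print(len(row_fields), row_fields)        # 4 actual values
```

Each unit such as (s) ends up as its own field, which gives 7 header fields against 4 data fields.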

You'll need to either fix the input file, by e.g. removing the units or appending them directly to the corresponding column name, for example,

time[s] motorpos[steps] azimuth[deg] signal_magnitude

Or you have to skip the header entirely, and provide the column names yourself in the code, using the names argument set to a list of strings in genfromtxt:

data = np.genfromtxt(
    'Data.dat', comments=';', skip_header=9, skip_footer=1,
    names=["time", "motorpos", "azimuth", "signal_magnitude"], dtype=None)
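If you want to try this without touching your real file, a minimal sketch along these lines should work. It assumes an in-memory copy of the sample from the question (via io.StringIO) as a stand-in for Data.dat, and it drops skip_header and skip_footer because the shortened sample has no extra lines to skip; the comment lines are ignored through comments=';'.

```python
import io

import numpy as np

# In-memory stand-in for Data.dat, shortened to the lines shown in the question.
sample = io.StringIO(
    "; line 8\n"
    "; time (s) motorpos (steps) azimuth (deg) signal_magnitude\n"
    "0.90 0 0.000 1.24230e-01\n"
    "3.07 6000 3.000 1.24530e-01\n"
    "5.27 12000 6.000 1.23240e-01\n"
    "7.55 18000 9.000 1.23330e-01\n"
)

# Column names are supplied by hand, so the header is treated as a plain comment.
data = np.genfromtxt(
    sample,
    comments=";",
    names=["time", "motorpos", "azimuth", "signal_magnitude"],
    dtype=None,
)

print(data.dtype.names)
print(data["motorpos"])  # columns are accessed by name
```

The result is a structured array with one record per data row, so data["time"] and data["signal_magnitude"] can be passed straight to a plotting call.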

armalko · Accepted Answer · 2023-05-23 17:02:57Z

Remove the names=True parameter and change the type parameter to dtype.

import numpy as np
data = np.genfromtxt('Data.dat', comments=';', skip_header=8, skip_footer=1, dtype=None)
print(data)

With names=True, it tries to parse the first line after the skipped header lines as the names of the columns.

Alternatively, you can pass a list of names for your columns:

data = np.genfromtxt('Data.dat', comments=';', skip_header=8, skip_footer=1, dtype=None, names=["time (s)", "motorpos (steps)", "azimuth (deg)", "signal_magnitude"])
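One thing to be aware of with names that contain spaces or parentheses: as far as I know, genfromtxt cleans up field names (by default spaces are replaced with underscores and characters such as parentheses are dropped), so the stored names may differ from the strings you passed. A quick way to check, assuming an in-memory copy of part of the sample as a stand-in for Data.dat:

```python
import io

import numpy as np

# Shortened in-memory stand-in for Data.dat (an assumption for this sketch).
sample = io.StringIO(
    "; time (s) motorpos (steps) azimuth (deg) signal_magnitude\n"
    "0.90 0 0.000 1.24230e-01\n"
    "3.07 6000 3.000 1.24530e-01\n"
)

data = np.genfromtxt(
    sample,
    comments=";",
    dtype=None,
    names=["time (s)", "motorpos (steps)", "azimuth (deg)", "signal_magnitude"],
)

# Inspect the names that were actually stored before indexing by them.
stored_names = data.dtype.names
print(stored_names)

# Index with the stored name rather than the original string.
first = data[stored_names[0]]
print(first)
```

Looking up columns through data.dtype.names avoids guessing how a name was rewritten.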

For more information, read the np.genfromtxt documentation.
