Issue with dropping columns

Question

I'm trying to read in a data set and drop its first two columns, but the wrong columns seem to be dropped. I was looking at this thread, but its suggestion is not giving the expected result. My data set starts with 6 columns, and I need to remove the first two. Other threads suggest dropping columns by label, but I would prefer not to name columns only to drop them if I can do it in one step.

df = pd.read_excel('Data.xls', header=17, footer=246)
df.drop(df.columns[[0, 1]], axis=1, inplace=True)

But it is dropping columns 4 and 5 instead of the first two. Is there something with the drop function that I'm just completely missing?

Print out df.columns and make sure it looks like what you were expecting. Maybe the order got changed somewhere?
– JohnE, Commented Nov 14, 2016 at 0:22
OK, that seems to be the issue here. When I do that, I get this output: Index(['Petajoules', 'Gigajoules', '%'], dtype='object'). Petajoules is the third column when I look at the data set visually; the first two columns are not included. How would I drop those two columns if they aren't in df.columns?
– Stephen Juza, Commented Nov 14, 2016 at 1:12
The first two columns might be in the index (a multi-index). Try df.reset_index(), which converts index levels into regular columns.
– JohnE, Commented Nov 14, 2016 at 1:20
Thanks, it was the index. However, that leads to a different problem. When I try df.reset_index(), I get an error ("cannot do an non-empty take from an empty axes"). Even if I solved that error, isn't reset_index() a destructive process? The third column of the multilevel index is the one I need as the index. However, when I try df.set_index, it won't let me set the index to that column because it is currently part of the index.
– Stephen Juza, Commented Nov 14, 2016 at 2:27
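On the question of whether reset_index() is destructive: the sketch below suggests it is not, since it returns a new frame unless inplace=True is passed. It also shows one possible way to discard the first two index levels while keeping the third as the index. The frame and the level names ("code", "region", "country") are hypothetical stand-ins for the spreadsheet, and the sketch does not reproduce or fix the error quoted above.

```python
import pandas as pd

# Hypothetical frame with a three-level index; all names and values are made up.
df = pd.DataFrame(
    {"Petajoules": [10.0, 20.0], "Gigajoules": [1.0, 2.0], "%": [5.0, 6.0]},
    index=pd.MultiIndex.from_tuples(
        [(1, "a", "X"), (2, "b", "Y")],
        names=["code", "region", "country"],
    ),
)

# reset_index returns a new frame; df itself keeps its three index levels.
flat = df.reset_index()

# Discard index levels 0 and 1 entirely and keep the third as the index.
trimmed = df.reset_index(level=[0, 1], drop=True)

print(list(flat.columns))
print(trimmed.index.name, list(trimmed.columns))
```

With drop=True the removed levels are thrown away instead of being inserted as columns, so no separate set_index call is needed in this sketch.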

THN · Accepted Answer · 2017-03-06 17:02:01Z

If I understand your question correctly, you have a multilevel index, so df.columns[[0, 1]] counts only the non-index columns, and the drop removes the first two of those.
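A minimal sketch of that behavior, assuming the first three spreadsheet columns ended up in the index. The index column names and all values are hypothetical; only the three data column labels come from the output quoted in the comments.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: six visible columns.
raw = pd.DataFrame({
    "code": [1, 2],
    "region": ["a", "b"],
    "country": ["X", "Y"],
    "Petajoules": [10.0, 20.0],
    "Gigajoules": [1.0, 2.0],
    "%": [5.0, 6.0],
})

# Assumption: the first three columns were turned into a MultiIndex on read.
df = raw.set_index(["code", "region", "country"])

# Index levels are not listed in df.columns.
print(list(df.columns))

# So positions 0 and 1 refer to the first two data columns, not the index levels.
dropped = df.drop(df.columns[[0, 1]], axis=1)
print(list(dropped.columns))
```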

If you know the positions of the columns, why not select them directly, such as:

df = df.iloc[:, 3:]
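As a rough illustration of positional selection, the sketch below uses a hypothetical six-column frame. The slice start depends on how many leading columns should be skipped and on whether those columns are regular columns or index levels, so the values 2 and 1 here are illustrative assumptions rather than the right numbers for the original file.

```python
import pandas as pd

# Hypothetical flat frame with six regular columns.
flat = pd.DataFrame(
    [[1, "a", "X", 10.0, 1.0, 5.0]],
    columns=["code", "region", "country", "Petajoules", "Gigajoules", "%"],
)

# Skip the first two columns by position.
kept = flat.iloc[:, 2:]
print(list(kept.columns))

# With the leading columns in the index, iloc counts only the data columns,
# so a start of 1 skips just "Petajoules".
indexed = flat.set_index(["code", "region", "country"])
tail = indexed.iloc[:, 1:]
print(list(tail.columns))
```

Checking df.columns before choosing the slice start shows which columns iloc will actually count.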
