Linked Questions
35 questions linked to/from python pandas remove duplicate columns
25 votes
4 answers
47k views
Fast method for removing duplicate columns in pandas.Dataframe [duplicate]
So by using df_ab = pd.concat([df_a, df_b], axis=1, join='inner') I get a DataFrame looking like this: A A B B 0 5 5 10 10 1 6 6 19 19 and I want to remove its ...
7 votes
1 answer
17k views
Removing multiple columns with the same name except the first one? [duplicate]
If you have multiple columns with the same name in a dataframe, how do you remove all of the columns except the first one?
6 votes
2 answers
7k views
Removing columns containing duplicated data from a pandas dataframe? [duplicate]
If I have a dataframe like below: | Column A | Column B | Column C | Column D | Column E | |:-----------|:---------- |:-----------|:-----------|:-----------| | 1 | 7 | 1 ...
-1 votes
1 answer
152 views
Pandas to delete matched columns [duplicate]
I have Excel sheets in a workbook that have multiple columns. Those columns have different headings. Some of them might hold the same data. Those reports are generated using pandas. How to compare all ...
-1 votes
1 answer
140 views
Unique columns pandas [duplicate]
I have a pandas dataframe of dimensions (20000, 3000), and there are some duplicated columns, but they have different headings. How would I remove those duplicates but keep the original columns ...
-1 votes
1 answer
80 views
How do I drop a duplicate pandas df column based on the name of the column? [duplicate]
How do I drop the duplicate row below "Second Column Name" based on its name? Dropping by a specific name is important, as depending on the name it could be either keep first or keep last. I ...
403 votes
23 answers
1.1m views
How to show all columns' names on a large pandas dataframe?
I have a dataframe that consists of hundreds of columns, and I need to see all column names. What I did: In[37]: data_all2.columns The output is: Out[37]: Index(['customer_id', 'incoming', 'outgoing'...
130 votes
17 answers
416k views
Concat DataFrame Reindexing only valid with uniquely valued Index objects
I am trying to concat the following dataframes: df1 price side timestamp timestamp 2016-01-04 00:01:15.631331072 0.7286 2 1451865675631331 2016-01-04 ...
79 votes
7 answers
104k views
Pandas concat yields ValueError: Plan shapes are not aligned
In pandas, I am attempting to concatenate a set of dataframes and I am getting this error: ValueError: Plan shapes are not aligned. My understanding of .concat() is that it will join where columns are ...
33 votes
4 answers
39k views
Pandas concat failing
I am trying to concat dataframes from the following two csv files: df_a: https://www.dropbox.com/s/slcu7o7yyottujl/df_current.csv?dl=0 df_b: https://www.dropbox.com/s/laveuldraurdpu1/df_climatology....
11 votes
6 answers
23k views
How to remove duplicate columns from a dataframe using python pandas
By grouping two columns I made some changes. I generated a file using python, it resulted in 2 duplicate columns. How to remove duplicate columns from a dataframe?
14 votes
4 answers
22k views
feature_names must be unique - Xgboost
I am running the xgboost model for a very sparse matrix. I am getting this error: ValueError: feature_names must be unique. How can I deal with this? This is my code: yprob = bst.predict(xgb....
3 votes
4 answers
7k views
How to select columns which contain non-duplicate from a pandas data frame
I want to select columns which contain non-duplicate from a pandas data frame and use these columns to make up a subset data frame. For example, I have a data frame like this: x y z a 1 2 3 b ...
1 vote
6 answers
5k views
Check if a Series is already in a Dataframe
Let's say you have some students students = [ ['Jack', 34, 'Sydeny'] , ['Riti', 30, 'Delhi' ] , ['Aadi', 16, 'New York'] ] dfObj = pd.DataFrame(students, columns = ['Name', '...
2 votes
1 answer
7k views
Convert dataframe column with type "object" to a set()
Got a dataframe df with a column "Id" Id 0 -KkJz3CoJNM 1 08QMXEQbEWw 2 0ANuuVrIWJw 3 0pPU8CtwXTo 4 1-wYH2LEcmk I need to convert column "Id" into a set() but set_id = set(df["Id"]...
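Several of the questions above ask for the same two things: dropping columns that repeat a name, and dropping columns that repeat data under different headings. A minimal sketch of both follows. It is not taken from any of the linked answers; the frames df_ab and df_c are hypothetical, with df_ab shaped like the table in the first excerpt.

```python
import pandas as pd

# Hypothetical frame with repeated column names, as in the first excerpt.
df_ab = pd.DataFrame(
    [[5, 5, 10, 10], [6, 6, 19, 19]],
    columns=["A", "A", "B", "B"],
)

# Duplicate names: Index.duplicated() marks every repeat after the first
# occurrence, so inverting the mask keeps the first column per name.
by_name = df_ab.loc[:, ~df_ab.columns.duplicated()]

# Hypothetical frame where "x" and "y" hold the same data under different names.
df_c = pd.DataFrame({"x": [1, 2], "y": [1, 2], "z": [3, 4]})

# Duplicate data: transpose so columns become rows, drop repeated rows,
# then transpose back. The first of each identical group is kept.
by_value = df_c.T.drop_duplicates().T

print(list(by_name.columns))
print(list(by_value.columns))
```

The transpose approach copies the whole frame and can change dtypes when columns have mixed types, so for a frame as wide as the (20000, 3000) case mentioned above it may be slow; treat it as a starting point rather than a tuned solution.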