-1

I have Excel sheets in a workbook, each with multiple columns under different headings. Some of those columns may hold the same data. The reports are generated using pandas. How can I compare all the columns on each sheet and delete any column whose data duplicates another column? The headings and data look like this:

    2014  2015  2016  2017  2018
    12    14    12    15    20
    11    11    11    12    21

You can see that 2014 and 2016 hold the same data. How can I delete 2016 when it matches 2014? I have multiple sheets covering multiple years.

2
  • Thanks, this is working. However, I forgot to say that each column has 2014 at the top, then the indicator name under it, then the data. The T-based delete only works if the column name is just 2014. Any help would be appreciated. mazin Commented Feb 24, 2018 at 2:19
  • 1
    @Mazin Please edit the question itself, rather than make comments about what you should or should not have written. Commented Feb 24, 2018 at 2:47

1 Answer

1

Here you go:

    import pandas as pd

    # Sample data: '2012' and '2014' hold identical values
    data = {'2012': ['1', '2', '3', '4', '5'],
            '2013': ['2', '2', '2', '2', '2'],
            '2014': ['1', '2', '3', '4', '5']}
    df = pd.DataFrame(data, columns=['2012', '2013', '2014'])

    # Transpose so columns become rows, drop the duplicate rows, transpose back
    results = df.T.drop_duplicates().T
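To cover the "multiple sheets" part of the question, the same idea can be applied per sheet. The following is a minimal sketch, not a tested solution for the asker's workbook: the helper names `drop_duplicate_columns` and `dedupe_sheets` and the sheet name `Sheet1` are made up for illustration, and the sheets are built in memory. With a real file, `pd.read_excel(path, sheet_name=None)` should return a comparable dict of DataFrames keyed by sheet name. It uses a boolean mask instead of the double transpose, which leaves the original column dtypes alone.

```python
import pandas as pd


def drop_duplicate_columns(df):
    """Keep the first of any group of columns whose values are identical."""
    # df.T.duplicated() flags every column that repeats an earlier column
    keep = ~df.T.duplicated().to_numpy()
    return df.loc[:, keep]


def dedupe_sheets(sheets):
    """Apply drop_duplicate_columns to every DataFrame in a dict of sheets."""
    return {name: drop_duplicate_columns(df) for name, df in sheets.items()}


# Hypothetical in-memory stand-in for a workbook, using the question's numbers
sheets = {
    "Sheet1": pd.DataFrame({
        "2014": [12, 11],
        "2015": [14, 11],
        "2016": [12, 11],
        "2017": [15, 12],
        "2018": [20, 21],
    }),
}

cleaned = dedupe_sheets(sheets)
print(list(cleaned["Sheet1"].columns))
```

Here 2016 is dropped because it repeats 2014, and the first occurrence is the one that is kept.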

1 Comment

          Ind1  Ind1  Ind1  Ind1  Ind2  Ind2  Ind2
          2014  2015  2016  2017  2014  2015  2016
    dhb1  2     3     2     3     2     3     2
    dhb2  2     3     2     3     2     3     2

I need to delete the data under Ind1 for 2016 and 2017, and under Ind2 for 2016, because they are equal to another column under the same indicator. Thank you.
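For the two-level header described in this comment, one possible approach is to compare columns only within the same indicator. This is a sketch under assumptions: the header is modelled as a pandas `MultiIndex` with levels named `indicator` and `year` (names chosen here, not taken from the original data), and the helper name `drop_duplicates_within_indicator` is hypothetical. When reading from Excel, `pd.read_excel(..., header=[0, 1])` should produce a similar two-level column index.

```python
import pandas as pd

# Rebuild the commenter's example with a two-level column header
columns = pd.MultiIndex.from_tuples(
    [("Ind1", "2014"), ("Ind1", "2015"), ("Ind1", "2016"), ("Ind1", "2017"),
     ("Ind2", "2014"), ("Ind2", "2015"), ("Ind2", "2016")],
    names=["indicator", "year"],  # level names are an assumption
)
df = pd.DataFrame(
    [[2, 3, 2, 3, 2, 3, 2],
     [2, 3, 2, 3, 2, 3, 2]],
    index=["dhb1", "dhb2"],
    columns=columns,
)


def drop_duplicates_within_indicator(df):
    """Drop columns that repeat an earlier column under the same indicator."""
    # After transposing, each original column is a row; moving the indicator
    # level into a regular column makes it part of the duplicate comparison,
    # while the year stays in the index and is ignored.
    rows = df.T.reset_index(level=0)
    keep = ~rows.duplicated().to_numpy()
    return df.loc[:, keep]


result = drop_duplicates_within_indicator(df)
print(list(result.columns))
```

Because the indicator takes part in the comparison, Ind2 2014 is kept even though its values match Ind1 2014; only repeats under the same indicator are removed.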
