Python Pandas extra commas

Question

I'm doing a small job with CSV files and pandas. I need to merge two CSV lists into one and delete the duplicates, but the final output has extra commas added to the last column and I don't know why.

I have two CSV lists like this:

       DESCRIPTION  EXTRAS  ADDRESS  AVAILABLE
    1  House        WiFi    CP 432   1
    2  Farm         NONE    CP 345   1
    3  House        Wifi    CP 315   1

       DESCRIPTION  EXTRAS  ADDRESS  AVAILABLE
    1  House        WiFi    CP 437   0
    2  House        Wifi    CP 315   0

When I merge the two, this is the result (the number of "," characters seems completely random):

    ID DESCRIPTION EXTRAS ADDRESS AVAILABLE,,,,,
    1 House WiFi CP 432 1,,,,,,
    2 Farm NONE CP 345 1,,,,
    3 House Wifi CP 315 1,,,,,,
    1 House WiFi CP 437 0,,,,,

This is my code:

    with open("C:\\files\\20171412123920-1\\20171412123920-1Total.csv", "rt", encoding="utf-8") as f2:
        reader = csvCSV.reader(f)
        for row in reader:
            merged.append(row)

    with open("C:\\files\\20171412123920-1\\20171412123920-1.csv", "rt", encoding="utf-8") as f:
        readerTotal = csvCSV.reader(f2)
        for row in readerTotal:
            merged.append(row)

    with open("C:\\Users\\Desktop\\Test\\Python\\20171412123920-1Comparacion.csv", "wb") as csvfile:
        spamwriter = csv.writer(csvfile,dialect='excel', encoding='utf-8')
        spamwriter.writerow(["ID","DESCRIPTION","EXTRAS","ADDRESS","AVAILABLE"])
        for row in merged:
            spamwriter.writerow(row)

    df=pd.read_csv("C:\\Users\\Desktop\\Test\\Python\\20171412123920-1Comparacion.csv", error_bad_lines=False)
    df.to_string(index=False)
    df.drop_duplicates(['DESCRIPTION'], keep='first', inplace = True)
    df = df.reset_index(drop=True)
    df.set_index('ID', inplace = True)
    df.to_csv("C:\\Users\\Desktop\\Test\\Python\\201714121239201Comparacion.csv")

I'm really new to this kind of thing and I took the code from a tutorial; the "rt" means "read in default text mode".
– Jonan87, Commented Dec 15, 2017 at 9:52
First of all, always use pd.read_csv when loading CSVs into a dataframe. I think this problem is happening because of the manner in which you're reading those CSVs.
– cs95, Commented Dec 15, 2017 at 9:54
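As a rough illustration of the comment above, here is one way reading and rewriting rows by hand could carry stray commas into the output. This is only a sketch: it assumes the input rows end in a varying number of empty fields, which the question does not confirm, and the sample string `raw` is made up for the demonstration.

```python
import csv
import io

# Hypothetical input: two rows ending in a different number of empty fields.
raw = "1,House,WiFi,CP 432,1,,\n2,Farm,NONE,CP 345,1,,,,\n"

# csv.reader keeps every trailing comma as an empty string at the end of the row.
rows = list(csv.reader(io.StringIO(raw)))
field_counts = [len(row) for row in rows]

# csv.writer writes those empty strings back out as trailing commas.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
round_tripped = out.getvalue()

print(field_counts)
print(round_tripped)
```

If the source files look like this, copying rows with `csv.reader` and `csv.writer` reproduces the uneven commas exactly, so the merged file inherits them.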

Alkesh Mahajan · Accepted Answer · 2017-12-15 12:51:55Z

First, load both CSV files into pandas DataFrames and concatenate them. Then drop the duplicate rows from the combined DataFrame.

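    import pandas as pd

    # Let pandas parse both files directly
    df1 = pd.read_csv('first.csv')
    df2 = pd.read_csv('second.csv')

    # Stack the two DataFrames on top of each other
    frames = [df1, df2]
    result = pd.concat(frames)

    # drop_duplicates() returns a new DataFrame, so keep the result
    df5 = result.drop_duplicates()
    print(df5)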
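A self-contained variant of the same idea, using the sample rows from the question, might look like the sketch below. It makes several assumptions: the files are treated as comma-separated with an `ID` header, the in-memory `first_csv` and `second_csv` objects stand in for the real files, and duplicates are judged on `DESCRIPTION` alone, as in the question's own `drop_duplicates` call. A subset is used because in the sample data the otherwise matching rows differ in `AVAILABLE`, so a plain `drop_duplicates()` over all columns would remove nothing.

```python
import io

import pandas as pd

# In-memory stand-ins for the two files (assumed comma-separated).
first_csv = io.StringIO(
    "ID,DESCRIPTION,EXTRAS,ADDRESS,AVAILABLE\n"
    "1,House,WiFi,CP 432,1\n"
    "2,Farm,NONE,CP 345,1\n"
    "3,House,Wifi,CP 315,1\n"
)
second_csv = io.StringIO(
    "ID,DESCRIPTION,EXTRAS,ADDRESS,AVAILABLE\n"
    "1,House,WiFi,CP 437,0\n"
    "2,House,Wifi,CP 315,0\n"
)

df1 = pd.read_csv(first_csv)
df2 = pd.read_csv(second_csv)

# Stack the two tables and renumber the rows.
merged = pd.concat([df1, df2], ignore_index=True)

# Keep the first row for each DESCRIPTION (subset taken from the question's code).
deduped = merged.drop_duplicates(subset=["DESCRIPTION"], keep="first")

# index=False avoids writing the row index as an extra column.
csv_text = deduped.to_csv(index=False)
print(csv_text)
```

Changing `subset` to another column list, for example `["ADDRESS"]`, changes which rows count as duplicates. Since pandas writes exactly one field per column, the output rows carry no trailing commas.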
