Combining excel spreadsheets using dictionaries

Question

I've made a simple example where I am trying to merge two spreadsheets. The aim is to create a spreadsheet with 'Name of City', 'State' and 'Population' as the three columns. I think the way to do it is to use dictionaries.

I've had a go at it myself and this is what I have so far.

[image: code] [image: data]

Please include your code and data as text in your question, not as images. – Thierry Lathuille, Commented Apr 14, 2017 at 12:11
The easiest way would be to use pandas.read_excel to get two DataFrames and then merge them. – Maarten Fabré, Commented Apr 14, 2017 at 12:31

Felix · Accepted Answer · 2017-04-14 12:39:29Z

Do you know the pandas package?

You can read each Excel file into a DataFrame with pandas.read_excel and then merge the two DataFrames on the 'Name of City' column.
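A rough sketch of that workflow, not a definitive implementation. The function names and file paths are hypothetical, and it assumes both workbooks keep their data on the first sheet with a 'Name of City' header. Reading and writing .xlsx files also assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def merge_city_tables(states, populations):
    """Join two city tables on their shared 'Name of City' column."""
    return pd.merge(states, populations, on="Name of City")


def merge_city_workbooks(states_path, populations_path, out_path):
    """Read two workbooks, merge them, and write the result (paths are hypothetical)."""
    states = pd.read_excel(states_path)            # first sheet by default
    populations = pd.read_excel(populations_path)
    merged = merge_city_tables(states, populations)
    merged.to_excel(out_path, index=False)         # index=False drops the row numbers
    return merged
```

It could be called as merge_city_workbooks("states.xlsx", "population.xlsx", "merged.xlsx"), where all three file names are placeholders.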

Here's a short example that shows how easy merging two dataframes is using pandas:

    In [1]: import pandas as pd

    In [3]: df1 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
       ...:                     'State': ['NSW', 'VIC']})

    In [4]: df2 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
       ...:                     'Population': [1000000, 200000]})

    In [5]: result = pd.merge(df1, df2, on='Name of City')

    In [6]: result
    Out[6]:
      Name of City State  Population
    0       Sydney   NSW     1000000
    1    Melbourne   VIC      200000
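For comparison, the dictionary approach the question mentions could look roughly like this in plain Python. This is only a sketch: the function name merge_by_city is made up for illustration, the sample values mirror the DataFrames above, and in practice the two dictionaries would be filled from the spreadsheets.

```python
# City -> value lookups; the sample rows mirror the DataFrames above.
states = {"Sydney": "NSW", "Melbourne": "VIC"}
populations = {"Sydney": 1000000, "Melbourne": 200000}


def merge_by_city(states, populations):
    """Return (city, state, population) rows for cities present in both dicts."""
    rows = []
    for city, state in states.items():
        if city in populations:  # skip cities missing from the second table
            rows.append((city, state, populations[city]))
    return rows


rows = merge_by_city(states, populations)
for city, state, population in rows:
    print(city, state, population)
```

Cities that appear in only one dictionary are dropped here, which matches the default inner join of pd.merge.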

Community · Accepted Answer · 2017-05-23 12:09:51Z

Perhaps this?

    import os
    import os.path

    import xlrd
    import xlsxwriter

    # Note: xlrd 2.0+ reads only .xls files; reading .xlsx needs xlrd < 2.0.
    file_name = input("Decide the destination file name: ")
    merged_file_name = file_name + ".xlsx"
    dest_book = xlsxwriter.Workbook(merged_file_name)
    dest_sheet_1 = dest_book.add_worksheet()
    dest_row = 1          # next free row in the destination (row 0 holds the header)
    header_written = False
    path = input("Enter the path: ")

    for root, dirs, files in os.walk(path):
        files = [f for f in files if f.endswith('.xlsx')]
        for xlsfile in files:
            print("File in mentioned folder is: " + xlsfile)
            temp_book = xlrd.open_workbook(os.path.join(root, xlsfile))
            temp_sheet = temp_book.sheet_by_index(0)
            # Copy the header row once, from the first file found.
            if not header_written:
                for col_index in range(temp_sheet.ncols):
                    value = temp_sheet.cell_value(0, col_index)
                    dest_sheet_1.write(0, col_index, value)
                header_written = True
            # Append every data row (skipping the header) to the destination.
            for row_index in range(1, temp_sheet.nrows):
                for col_index in range(temp_sheet.ncols):
                    value = temp_sheet.cell_value(row_index, col_index)
                    dest_sheet_1.write(dest_row, col_index, value)
                dest_row = dest_row + 1

    dest_book.close()

    # Reopen the merged file to report its size.
    book = xlrd.open_workbook(merged_file_name)
    sheet = book.sheet_by_index(0)
    print("number of rows in destination file are: ", sheet.nrows)
    print("number of columns in destination file are: ", sheet.ncols)

It seems like this should work just as well.

    import pandas as pd

    # filenames
    excel_names = ["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"]

    # read them in
    excels = [pd.ExcelFile(name) for name in excel_names]

    # turn them into dataframes
    frames = [x.parse(x.sheet_names[0], header=None, index_col=None) for x in excels]

    # delete the first row for all frames except the first
    # i.e. remove the header row -- assumes it's the first
    frames[1:] = [df[1:] for df in frames[1:]]

    # concatenate them..
    combined = pd.concat(frames)

    # write it out
    combined.to_excel("c.xlsx", header=False, index=False)
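Note that concatenation stacks rows from sheets that share the same columns, whereas a merge joins columns on a key. A minimal sketch of the stacking step on in-memory frames, assuming each frame already has its header as column names (so no header row needs to be dropped by hand); the helper name stack_frames and the sample rows are illustrative only:

```python
import pandas as pd


def stack_frames(frames):
    """Stack frames that share the same columns, renumbering rows from 0."""
    return pd.concat(frames, ignore_index=True)


# Illustrative sample rows standing in for two sheets with identical headers.
a = pd.DataFrame({"Name of City": ["Sydney"], "State": ["NSW"]})
b = pd.DataFrame({"Name of City": ["Melbourne"], "State": ["VIC"]})

combined = stack_frames([a, b])
print(combined)
```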

How to concatenate three excels files xlsx using python?
