0

I've made a simple example where I am trying to merge two spreadsheets. The aim is to create a spreadsheet with 'Name of City', 'State' and 'Population' as the three columns. I think the way to do it is to use dictionaries.

I've had a go at it myself and this is what I have so far.

[Images: code, data]

  • Please include your code and data as text in your question, not as images. Commented Apr 14, 2017 at 12:11
  • The easiest way would be to use pandas.read_excel to get two DataFrames and then merge them. Commented Apr 14, 2017 at 12:31

2 Answers

3

Do you know the pandas package?

You can read each Excel file into a DataFrame with pandas.read_excel and then merge the two DataFrames on the Name of City column.
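A rough sketch of that workflow, assuming each workbook keeps its table on the first sheet with a header row. The function names and the file names in the usage comment are hypothetical, and `how="outer"` is an added choice so that a city missing from one sheet is kept with an empty cell rather than dropped (the default inner join would drop it):

```python
import pandas as pd


def merge_city_frames(states: pd.DataFrame, populations: pd.DataFrame) -> pd.DataFrame:
    """Join the two tables on the shared 'Name of City' column."""
    # how="outer" keeps cities that appear in only one table; the gaps become NaN.
    merged = pd.merge(states, populations, on="Name of City", how="outer")
    # Put the columns in the order the question asks for.
    return merged[["Name of City", "State", "Population"]]


def merge_city_workbooks(states_path: str, populations_path: str, output_path: str) -> None:
    """Read both workbooks, merge them, and write the result to a new workbook."""
    states = pd.read_excel(states_path)            # first sheet, first row as header
    populations = pd.read_excel(populations_path)
    merge_city_frames(states, populations).to_excel(output_path, index=False)


# Usage (hypothetical file names):
# merge_city_workbooks("states.xlsx", "populations.xlsx", "merged.xlsx")
```

Reading and writing .xlsx files this way needs an Excel engine such as openpyxl installed alongside pandas.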

Here's a short example that shows how easy merging two dataframes is using pandas:

    In [1]: import pandas as pd

    In [3]: df1 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
       ...:                     'State': ['NSW', 'VIC']})

    In [4]: df2 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
       ...:                     'Population': [1000000, 200000]})

    In [5]: result = pd.merge(df1, df2, on='Name of City')

    In [6]: result
    Out[6]:
      Name of City State  Population
    0       Sydney   NSW     1000000
    1    Melbourne   VIC      200000
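For comparison, the dictionary approach mentioned in the question can be sketched with the standard library alone. This is only an illustration: `merge_by_city` is a hypothetical helper, the rows are typed in by hand instead of being read from the spreadsheets, and a city with no population entry simply gets `None`.

```python
def merge_by_city(state_rows, population_rows):
    """Combine (city, state) and (city, population) rows into three-column rows."""
    # Index the populations by city name so each lookup is a single dict access.
    population_by_city = {city: population for city, population in population_rows}

    merged = [("Name of City", "State", "Population")]
    for city, state in state_rows:
        # .get returns None when a city has no population entry.
        merged.append((city, state, population_by_city.get(city)))
    return merged


# Sample rows using the same values as the pandas example above.
state_rows = [("Sydney", "NSW"), ("Melbourne", "VIC")]
population_rows = [("Melbourne", 200000), ("Sydney", 1000000)]

for row in merge_by_city(state_rows, population_rows):
    print(row)
```

The row order of the second table does not matter, because the lookup goes through the city name. pandas does the same matching for you and also handles the Excel input and output.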

0

Perhaps this?

    import os
    import xlrd
    import xlsxwriter

    # Note: xlrd 2.0 and later read only .xls files, so opening .xlsx
    # workbooks as done here requires an older xlrd release (< 2.0).

    file_name = input("Enter the destination file name (without extension): ")
    merged_file_name = file_name + ".xlsx"
    dest_book = xlsxwriter.Workbook(merged_file_name)
    dest_sheet_1 = dest_book.add_worksheet()
    dest_row = 1          # next free row in the destination; row 0 holds the header
    header_written = False
    path = input("Enter the path of the folder to scan: ")

    for root, dirs, files in os.walk(path):
        files = [name for name in files if name.endswith('.xlsx')]
        for xlsfile in files:
            print("File in mentioned folder is: " + xlsfile)
            temp_book = xlrd.open_workbook(os.path.join(root, xlsfile))
            temp_sheet = temp_book.sheet_by_index(0)
            # Copy the header row once, from the first workbook found.
            if not header_written:
                for col_index in range(temp_sheet.ncols):
                    value = temp_sheet.cell_value(0, col_index)
                    dest_sheet_1.write(0, col_index, value)
                header_written = True
            # Append every data row (skipping the header) below the rows copied so far.
            for row_index in range(1, temp_sheet.nrows):
                for col_index in range(temp_sheet.ncols):
                    value = temp_sheet.cell_value(row_index, col_index)
                    dest_sheet_1.write(dest_row, col_index, value)
                dest_row = dest_row + 1

    dest_book.close()

    # Reopen the merged file to report its size.
    book = xlrd.open_workbook(merged_file_name)
    sheet = book.sheet_by_index(0)
    print("number of rows in destination file are: ", sheet.nrows)
    print("number of columns in destination file are: ", sheet.ncols)

It seems like this should work just as well.

    import pandas as pd

    # filenames
    excel_names = ["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"]

    # read them in
    excels = [pd.ExcelFile(name) for name in excel_names]

    # turn them into dataframes
    frames = [x.parse(x.sheet_names[0], header=None, index_col=None) for x in excels]

    # delete the first row for all frames except the first
    # i.e. remove the header row -- assumes it's the first
    frames[1:] = [df[1:] for df in frames[1:]]

    # concatenate them..
    combined = pd.concat(frames)

    # write it out
    combined.to_excel("c.xlsx", header=False, index=False)

How to concatenate three excels files xlsx using python?
