I have to build an application that imports all the Excel files in a given folder and combines them into a single dataframe. The dataframe should look as shown:
As seen in the image, one of the columns of the dataframe is the name of the file.
I have successfully added that column to the final dataframe; the code is as follows:
    import pandas as pd
    import os
    import shutil
    import re

    path = 'C:/Users/Administrator/Desktop/Zerodha/Day2'
    lst = os.listdir(path)
    files = [os.path.join(path, x) for x in lst]
    print(lst)

    dataframes_lst = []
    for file in files:
        # Use the bare file name (without the folder) as the column header
        filename = os.path.basename(file)
        # Read columns 0 and 4 only, and index each frame by date
        dataframe = pd.read_csv(file, usecols=[0, 4], names=["date", filename], index_col=["date"])
        dataframes_lst.append(dataframe)

    # Place the frames side by side, aligning rows on the date index
    df = pd.concat(dataframes_lst, axis=1)
    print(df)
    df.to_csv('data.csv')
The dataframe which is obtained using this code is as displayed:
For reference I will attach the snippet of one of the excel files:
Also, as seen above, the result contains many NaN values. I tried to remove them with df.dropna(inplace=True) and also by following the suggestion in this post:
But the resulting dataframe still contains the NaN values.
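To make the problem easier to reproduce, here is a minimal sketch with two made-up frames. The column names `a.csv` and `b.csv`, the dates, and the values are hypothetical stand-ins, not my actual data. It assumes the NaN values come from files that do not share exactly the same dates, so that `pd.concat(..., axis=1)` leaves a gap wherever a file has no row for a given date. It also shows what `dropna` does with its default `how='any'` compared with `how='all'`.

```python
import pandas as pd

# Hypothetical stand-ins for two per-file frames (made-up names, dates and values).
a = pd.DataFrame(
    {"a.csv": [1.0, 2.0]},
    index=pd.Index(["2020-01-01", "2020-01-02"], name="date"),
)
b = pd.DataFrame(
    {"b.csv": [3.0, 4.0]},
    index=pd.Index(["2020-01-02", "2020-01-03"], name="date"),
)

# axis=1 aligns rows on the index; dates missing from a frame become NaN.
combined = pd.concat([a, b], axis=1)
print(combined)

# Default how='any': drop every row that has at least one NaN.
only_complete = combined.dropna()
print(only_complete)

# how='all': drop a row only if every column in it is NaN.
not_all_nan = combined.dropna(how="all")
print(not_all_nan)
```

Note that `dropna` returns a new dataframe and leaves the original untouched unless `inplace=True` is passed or the result is assigned back to a variable.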



