I have read an Excel file with pandas, as shown below. How can I plot it properly with Matplotlib?

BTW, when I call read_clipboard() on data in this format, it raises ParserError: Expected 4 fields in line 3, saw 5. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

[screenshot of the original Excel sheet]

After manually modifying the Excel file into the following format:

        date  A_ratio  A_price  B_ratio  B_price
    0   2007    12.00     8.90     3.04     6.35
    1   2008    13.00     8.78     4.04     6.25
    2   2009    14.00     9.08     5.04     6.50
    3   2010    14.71     9.21     1.38     6.60
    4   2011    15.71     9.22     2.38     6.66
    5   2012    16.71     9.27     3.38     6.66
    6   2013    16.09     9.56     1.38     6.85
    7   2014    17.09     9.71     2.38     6.94
    8   2015    18.09     9.31     3.38     6.65
    9   2016    19.09     9.88     4.38     6.95
    10  2017    20.09     9.76     5.38     6.88

I plotted it with the following code. It works, but I don't want to edit the file by hand, since my original data is pretty large:

    df = df.set_index('date')
    plt.figure(figsize=(10, 10))
    cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
    df[cols].plot(kind='bar')
    plt.xticks(rotation=45)
    plt.xlabel("")

Output: [bar chart image]

Please help me, thanks.

1 Answer

I think you can use map with join to flatten the MultiIndex columns:

    df = df.set_index('date')
    df.columns = df.columns.map('_'.join)
    plt.figure(figsize=(10, 10))
    cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
    df[cols].plot(kind='bar')
    plt.xticks(rotation=45)
    plt.xlabel("")

Or you can select the MultiIndex columns by tuples:

    df = df.set_index('date')
    plt.figure(figsize=(10, 10))
    cols = [('A', 'ratio'), ('A', 'price'), ('B', 'ratio'), ('B', 'price')]
    df[cols].plot(kind='bar')
    plt.xticks(rotation=45)
    plt.xlabel("")
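
To make the two options easier to compare, here is a minimal self-contained sketch. It assumes the frame already has two-level columns and the years as its index. The sample frame is built in memory from the first three rows of the question's table, so the names `subset` and `flat` and the Agg backend are illustrative choices, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample: two header levels (group, measure), years as the index
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
data = [
    [12.00, 8.90, 3.04, 6.35],
    [13.00, 8.78, 4.04, 6.25],
    [14.00, 9.08, 5.04, 6.50],
]
index = pd.Index([2007, 2008, 2009], name="date")
df = pd.DataFrame(data, index=index, columns=columns)

# Option 1: select columns by tuple, leaving the MultiIndex intact
subset = df[[("A", "ratio"), ("B", "ratio")]]

# Option 2: join each (group, measure) tuple into one string such as "A_ratio"
flat = df.copy()
flat.columns = flat.columns.map("_".join)

# Plot the flattened frame as grouped bars, one group per year
ax = flat.plot(kind="bar", figsize=(10, 10), rot=45)
ax.set_xlabel("")
```

Flattening gives shorter legend labels and plain string column names, while tuple selection keeps the two header levels available for later grouping.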

6 Comments

Sorry, df = df.set_index('date') raises ValueError: cannot handle a non-unique multi-index! I think the problem comes from the multiple header rows. pandas has not read the Excel file correctly yet.
@ahbon - Did you try pd.read_excel(file, header=[0, 1], index_col=[0])?
That seems to read the file properly, but df.set_index('date') then raises KeyError: "None of ['date'] are in the columns".
@ahbon - What if you omit df = df.set_index('date')? Reading with index_col=[0] should already make date the index.
If you check df = pd.read_excel(file, header=[0, 1], index_col=[0]), you will see that it converts the date column to the index, so use df = df.sort_index() instead of df.sort_values(by='date').
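
The reading step from these comments can be sketched as follows. To keep it runnable without an Excel file, the sketch uses pd.read_csv on an in-memory string as a stand-in for pd.read_excel, with the same header=[0, 1] and index_col=0 arguments. The two-row header layout and the sample values are assumptions about what the original sheet looks like.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-row header: groups (A, B) on top, measures underneath.
# The first column holds the years and is left unnamed in both header rows.
raw = StringIO(
    ",A,A,B,B\n"
    ",ratio,price,ratio,price\n"
    "2008,13.00,8.78,4.04,6.25\n"
    "2007,12.00,8.90,3.04,6.35\n"
    "2009,14.00,9.08,5.04,6.50\n"
)

# header=[0, 1] builds MultiIndex columns; index_col=0 makes the years the index,
# so no set_index('date') call is needed afterwards
df = pd.read_csv(raw, header=[0, 1], index_col=0)

# The years are the index now, so sort by index rather than by a 'date' column
df = df.sort_index()
```

With a real workbook, the same two arguments would go to pd.read_excel, as suggested in the comments above.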