How to get a value of a column from another df based on index? pandas

Question

I have two data frames, and I'd like to add data from the second data frame to the first one, matched on their index. The catch is that I do this iteratively, and the column label of only the first df increases by one with each iteration, which causes an error.

For example, the first df after the first iteration:

         0
440  7.691

Second df after first iteration (doesn't change after each iteration):

     1
0    M
1    M
2    M
3    M
4    M
..  ..
440  B
441  M
442  M

When I run the code, I get the wanted df:

df_with_label = first_df.join(self.second_df)

         0  1
440  7.691  B

After the second iteration, my first df is now:

       1
3  10.72

and when I run the same df_with_label = first_df.join(self.second_df), I'd like to get:

       1  2
3  10.72  M

But I get the error:

ValueError: columns overlap but no suffix specified: Int64Index([1], dtype='int64')

I'm guessing the problem is that the column label of the first df is 1 after the second iteration, but I don't know how to fix it. I'd like the column label of the first df to keep increasing.

The best solution would be to give the second column a different name, like so:

       1 class
3  10.72     M

Any idea how to fix it?

I gave an example of both dfs for 2 iterations; is it not clear enough?
– Bella, Commented Sep 3, 2019 at 9:37
Surely it is not necessary to iterate. To make the question easier to understand, you should post an input dataframe and the corresponding output you expect to obtain.
– ansev, Commented Sep 3, 2019 at 9:40

baccandr · Accepted Answer · 2019-09-03 09:49:53Z

1

If I got it right, your second dataframe doesn't change between iterations, so why not change its column name once and for all:

second_df.columns = ['colname']

This should solve your naming conflicts.
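A minimal sketch of that rename, assuming a shortened stand-in for second_df and using "class" (the name the question asks for) in place of 'colname'. The data values are illustrative, not the asker's real data:

```python
import pandas as pd

# Stand-in for second_df: one label column named 1 (only a few rows shown).
second_df = pd.DataFrame(
    {1: ["M", "M", "M", "M", "M", "B"]},
    index=[0, 1, 2, 3, 4, 440],
)

# Rename once, before the loop, so the label column can no longer clash
# with the integer column labels of first_df.
second_df.columns = ["class"]

# Stand-in for first_df after the second iteration: column 1, row index 3.
first_df = pd.DataFrame({1: [10.72]}, index=[3])

# join matches rows on the index; the column names no longer overlap.
df_with_label = first_df.join(second_df)
print(df_with_label)
```

Because the rename happens outside the loop, the join inside the loop can stay exactly as it is in the question.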


2 Comments

Bella Over a year ago

Thanks. That's a possibility that worked. Is there a way to reset the column index of the first df at each iteration if I'd like to go in that direction?

baccandr Over a year ago

I'm not sure how your loop works, but you should be able to update the column names of your dataframe at each iteration by using the same command.
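One way this could look, as a sketch only: the loop below is hypothetical (the question does not show it), the list `results` stands in for whatever each iteration produces, and "value" is an assumed column name.

```python
import pandas as pd

# Stand-in for second_df: one label column named 1 (only a few rows shown).
second_df = pd.DataFrame(
    {1: ["M", "M", "M", "M", "M", "B"]},
    index=[0, 1, 2, 3, 4, 440],
)

# Hypothetical per-iteration frames; their column label grows by one each time.
results = [
    pd.DataFrame({0: [7.691]}, index=[440]),
    pd.DataFrame({1: [10.72]}, index=[3]),
]

labelled = []
for first_df in results:
    first_df = first_df.copy()
    # Same kind of assignment as in the answer, applied to first_df instead:
    # give its single column a fixed name on every iteration.
    first_df.columns = ["value"]
    labelled.append(first_df.join(second_df))

for df_with_label in labelled:
    print(df_with_label)
```

This discards the increasing column label, so it only fits if that number is not needed after the join.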

Georgina Skibinski · Accepted Answer · 2019-09-03 09:44:19Z

Try:

df_with_label = first_df.join(self.second_df, rsuffix="_2")

The thing is that first_df and second_df both have a column named 1, so rsuffix appends "_2" to the name of the second_df column, turning "1" into "1_2". You join on the indexes, so all the other columns of both frames are included by default, which is why you need to avoid naming conflicts.

REF https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.join.html
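A small sketch of the suffix approach, using made-up stand-ins for the two frames from the question's second iteration. It first reproduces the ValueError and then repeats the join with rsuffix:

```python
import pandas as pd

# Stand-in for second_df: one label column named 1 (only a few rows shown).
second_df = pd.DataFrame(
    {1: ["M", "M", "M", "M", "M", "B"]},
    index=[0, 1, 2, 3, 4, 440],
)

# Stand-in for first_df after the second iteration: its column is also named 1.
first_df = pd.DataFrame({1: [10.72]}, index=[3])

# Without a suffix, the overlapping column name makes join raise a ValueError.
try:
    first_df.join(second_df)
    raised = False
except ValueError as err:
    raised = True
    print(err)

# With rsuffix, the overlapping column from second_df is renamed to "1_2".
df_with_label = first_df.join(second_df, rsuffix="_2")
print(df_with_label)
```

The suffix is only applied to column names that actually overlap, so in iterations without a clash the label column keeps its original name. If a stable name is needed downstream, renaming second_df once may be simpler.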
