Add column from one dataframe to another, for values present in overlapping column

Question

Here is an example of what I am trying to do:

    In [46]: import numpy as np
        ...: import pandas as pd

    In [47]: df_3 = pd.DataFrame(np.arange(12).reshape(6,2), columns=["a", "z"])

    In [48]: df = pd.DataFrame(np.arange(12).reshape(4,3), columns=["a", "b", "c"])

    In [49]: df
    Out[49]:
       a   b   c
    0  0   1   2
    1  3   4   5
    2  6   7   8
    3  9  10  11

    [4 rows x 3 columns]

    In [50]: df_3
    Out[50]:
        a   z
    0   0   1  # present in df
    1   2   3
    2   4   5
    3   6   7  # present in df
    4   8   9
    5  10  11

    [6 rows x 2 columns]

I want to add column z to df, but only for rows whose value in column a also appears in df_3. Rows without a match should get a null value instead.

My desired output would look like this:

    In [52]: df["z"] = [1, np.nan, 7, np.nan]

    In [53]: df
    Out[53]:
       a   b   c   z
    0  0   1   2   1
    1  3   4   5 NaN
    2  6   7   8   7
    3  9  10  11 NaN

    [4 rows x 4 columns]

I tried a naive attempt:

    In [57]: df.merge(df_3, on=["a"])
    Out[57]:
       a  b  c  z
    0  0  1  2  1
    1  6  7  8  7

    [2 rows x 4 columns]

This drops the rows of df that have no match in df_3, which is not the result I am looking for.

EdChum · Accepted Answer · 2015-03-31 17:10:11Z

Merge on column 'a' and make it a left merge by passing how='left':

    In [72]: df.merge(df_3, on='a', how='left')
    Out[72]:
       a   b   c   z
    0  0   1   2   1
    1  3   4   5 NaN
    2  6   7   8   7
    3  9  10  11 NaN

You got this result:

    In [57]: df.merge(df_3, on=["a"])
    Out[57]:
       a  b  c  z
    0  0  1  2  1
    1  6  7  8  7

    [2 rows x 4 columns]

because the default merge type is 'inner', so a value of a has to exist in both the lhs and the rhs frame for its row to be kept. A left merge keeps every row of the lhs and fills the missing rhs values with NaN. See the docs: http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
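To see the difference between the two merge types side by side, here is a minimal self-contained sketch using the frames from the question. The variable names `inner` and `left` are only illustrative, and the `indicator=True` argument is an optional extra (not part of the original answer) that adds a `_merge` column showing where each row came from.

```python
import numpy as np
import pandas as pd

# Same frames as in the question.
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["a", "b", "c"])
df_3 = pd.DataFrame(np.arange(12).reshape(6, 2), columns=["a", "z"])

# Default how="inner": only rows whose 'a' value exists in both frames survive.
inner = df.merge(df_3, on="a")

# how="left": every row of df is kept; unmatched rows get NaN in 'z'.
# indicator=True adds a '_merge' column ("both" or "left_only" here).
left = df.merge(df_3, on="a", how="left", indicator=True)

print(inner)
print(left)
```

Note that `merge` returns a new frame rather than modifying `df`, so assign the result back (for example `df = df.merge(df_3, on="a", how="left")`) if you want `df` itself to carry the new column. Because of the NaN entries, `z` will come back as a float column.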
