Here is an example of what I am trying to do:

    In [45]: import numpy as np

    In [46]: import pandas as pd

    In [47]: df_3 = pd.DataFrame(np.arange(12).reshape(6,2), columns=["a", "z"])

    In [48]: df = pd.DataFrame(np.arange(12).reshape(4,3), columns=["a", "b", "c"])

    In [49]: df
    Out[49]:
       a   b   c
    0  0   1   2
    1  3   4   5
    2  6   7   8
    3  9  10  11

    [4 rows x 3 columns]

    In [50]: df_3
    Out[50]:
        a   z
    0   0   1  # present in df
    1   2   3
    2   4   5
    3   6   7  # present in df
    4   8   9
    5  10  11

    [6 rows x 2 columns]

I want to add column z to df, but only for rows whose value in column a also appears in df_3. Rows without a match should get a null value instead.

My desired output would look like this:

    In [52]: df["z"] = [1, np.nan, 7, np.nan]

    In [53]: df
    Out[53]:
       a   b   c    z
    0  0   1   2    1
    1  3   4   5  NaN
    2  6   7   8    7
    3  9  10  11  NaN

    [4 rows x 4 columns]

I tried naive attempts, like

    In [57]: df.merge(df_3, on=["a"])
    Out[57]:
       a  b  c  z
    0  0  1  2  1
    1  6  7  8  7

    [2 rows x 4 columns]

Which does not give me the result I am looking for.

1 Answer

Merge on column 'a' and pass how='left' to make it a left merge:

    In [72]: df.merge(df_3, on='a', how='left')
    Out[72]:
       a   b   c    z
    0  0   1   2    1
    1  3   4   5  NaN
    2  6   7   8    7
    3  9  10  11  NaN

The reason you got this result:

    In [57]: df.merge(df_3, on=["a"])
    Out[57]:
       a  b  c  z
    0  0  1  2  1
    1  6  7  8  7

    [2 rows x 4 columns]

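is because the default merge type is 'inner', so a value of 'a' has to exist in both the left-hand and right-hand DataFrames for its row to be kept. A 'left' merge keeps every row of the left-hand DataFrame and fills the columns coming from the right with NaN where there is no match. See the docs: http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging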
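
To make the difference easier to see, here is a minimal standalone sketch that rebuilds the two frames from the question and runs both kinds of merge. The `indicator=True` argument is not part of the original question or answer. It is added here only for illustration, and it assumes a pandas version recent enough to support it. It appends a `_merge` column that says whether each row matched.

```python
import numpy as np
import pandas as pd

# Same frames as in the question.
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["a", "b", "c"])
df_3 = pd.DataFrame(np.arange(12).reshape(6, 2), columns=["a", "z"])

# Default how='inner': only rows whose 'a' value appears in both frames survive.
inner = df.merge(df_3, on="a")

# how='left': every row of df is kept; rows without a match get NaN in 'z'.
# indicator=True (an addition for illustration) labels each row in '_merge'
# as 'both' or 'left_only'.
left = df.merge(df_3, on="a", how="left", indicator=True)

print(inner)
print(left)
```

Because the unmatched rows hold NaN, the z column of the left merge will typically come back as a float column rather than an integer one.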
