Set-up

I have two pandas DataFrames, df1 and df2, each with two columns: id and its corresponding url.

df1:

| id | url |
------------
| 1 | url |
| 2 | url |
| 3 | url |
| 4 | url |

df2:

| id | url |
------------
| 2 | url |
| 4 | url |
| 3 | url |
| 5 | url |
| 6 | url |

Some observations appear in both dfs, as the id column shows, e.g. observation 2 and its url are in both dfs.

The position of those 'double' observations within the dfs is not necessarily the same, e.g. observation 2 is in the second row of df1 and the first row of df2.

Lastly, the dfs do not necessarily have the same number of observations, e.g. df1 has four observations while df2 has five.
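For reproducibility, here is a minimal sketch of the set-up above. The example.com URLs are hypothetical placeholders (the tables only show "url"), and the overlap check at the end is just an illustration of what "in both dfs" means here.

```python
import pandas as pd

# Hypothetical placeholder URLs; the real values are not shown in the post.
ids1 = [1, 2, 3, 4]
ids2 = [2, 4, 3, 5, 6]
df1 = pd.DataFrame({"id": ids1, "url": [f"https://example.com/{i}" for i in ids1]})
df2 = pd.DataFrame({"id": ids2, "url": [f"https://example.com/{i}" for i in ids2]})

# ids that occur in both frames, regardless of row position
common_ids = sorted(set(df1["id"].tolist()) & set(df2["id"].tolist()))
print(common_ids)  # → [2, 3, 4]
```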


Problem

I want to extract all observations that appear only in df2 (i.e. whose id is not in df1) and insert them in a new df (df3), i.e. I want to obtain:

| id | url |
------------
| 5 | url |
| 6 | url |

How do I go about this?

I've tried this answer but cannot get it to work for my two-column dataframes.

I've also tried this other answer, but this gives me an empty common dataframe.

  • Are you after this: stackoverflow.com/questions/28901683/…? If so, then it's a dupe. Commented Jul 12, 2017 at 8:30
  • Thank you, but I cannot get that to work, see my question. Commented Jul 12, 2017 at 8:35
  • Please edit your post to include your attempts from that question; stating that it doesn't work is not informative. I believe it would work, but you need to show that it doesn't. The answer from Greg should also work: if it does, then this is a dupe; if it doesn't, then demonstrate this. Commented Jul 12, 2017 at 8:59

2 Answers

1

Possibly something like this:

    df3 = df2[~df2.id.isin(df1.id.tolist())]
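As a self-contained sketch of the isin approach on the frames from the question: the example.com URLs are hypothetical placeholders, and the reset_index call is an optional extra so that df3 gets a fresh 0-based index.

```python
import pandas as pd

# Hypothetical placeholder URLs; only the id values come from the question.
ids1 = [1, 2, 3, 4]
ids2 = [2, 4, 3, 5, 6]
df1 = pd.DataFrame({"id": ids1, "url": [f"https://example.com/{i}" for i in ids1]})
df2 = pd.DataFrame({"id": ids2, "url": [f"https://example.com/{i}" for i in ids2]})

# Boolean mask: True for rows of df2 whose id does not occur in df1
only_in_df2 = ~df2["id"].isin(df1["id"])

# Keep those rows; reset_index(drop=True) is optional and just renumbers them
df3 = df2[only_in_df2].reset_index(drop=True)
print(df3["id"].tolist())  # → [5, 6]
```

The row order of df2 is preserved, and the comparison uses only the id column, so the url values are never compared.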

1

ID numbers make good index labels:

    df1.index = df1.id
    df2.index = df2.id

Then use the very straightforward index.difference:

    diff_index = df2.index.difference(df1.index)
    df3 = df2.loc[diff_index]
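Put together as a runnable sketch, assuming the ids are unique within each frame. The example.com URLs are hypothetical placeholders, and set_index is used here as a variation on assigning to .index, so that df1 and df2 themselves are left unchanged.

```python
import pandas as pd

# Hypothetical placeholder URLs; only the id values come from the question.
ids1 = [1, 2, 3, 4]
ids2 = [2, 4, 3, 5, 6]
df1 = pd.DataFrame({"id": ids1, "url": [f"https://example.com/{i}" for i in ids1]})
df2 = pd.DataFrame({"id": ids2, "url": [f"https://example.com/{i}" for i in ids2]})

# set_index returns new frames indexed by id; the originals are not modified
df1_by_id = df1.set_index("id")
df2_by_id = df2.set_index("id")

# Labels that are in df2's index but not in df1's index
diff_index = df2_by_id.index.difference(df1_by_id.index)

# Select those rows; id is now the index of df3 rather than a column
df3 = df2_by_id.loc[diff_index]
print(df3.index.tolist())  # → [5, 6]
```

Note that Index.difference sorts its result where possible, so df3 comes back ordered by id rather than by the original row order of df2. Calling df3.reset_index() would turn id back into an ordinary column.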
