1
$\begingroup$

Pretty new to Python, but as an SEO I'm looking at the benefits of using notebooks in my workflow.

I've got two excel files which I've cleaned and imported into a new notebook using pandas.

I'm trying to compare position changes and create a new dataframe with new columns to show previous, new, and changes in positions.

Have a look at the screengrabs[! of the data below. Thanks in advance.

Dataframe1 Dataframe2

$\endgroup$
1
  • $\begingroup$ Could you briefly say what you want to do? like say you have dataframe 1 having columns {x, y, ...} and dataframe 2 with columns {z, m, n,...}. Now what are you planning to do with them? What would be the final dataframe? $\endgroup$ Commented Jan 20, 2020 at 1:09

1 Answer 1

1
$\begingroup$

You can do a pandas.DataFrame.join if you know how this works.

-- Edit: merge is apparently a better choice: See the example at the end.

I think you need an outer join on Keyword.

This should give a new DataFrame, that contains unique rows for the Keyword in both tables. Some entries may be NULL/None. This indicates that in the old or new table, the keyword was not present and you should treat is as a new keyword, or a keyword that has dropped from the list.

Rename the columns in the new table appropriately, and then apply a smart value between columns, taking into account that some values are NULL.

You can do a similar thing in Excel: https://superuser.com/questions/1023123/how-to-simulate-a-full-outer-join-in-excel


Edit:

Minimalistic example:

import pandas as pd old = pd.DataFrame({'keyword': ['football', 'soccer', 'rugby'], 'position': [2, 1, 3]}) new = pd.DataFrame({'keyword': ['hockey', 'rugby', 'soccer'], 'position': [3, 2, 1]}) old.keyword = old.keyword.astype(str) new.keyword = new.keyword.astype(str) old.set_index(['keyword']) new.set_index(['keyword']) old = old.rename(columns={"position": "position_old"}) new = new.rename(columns={"position": "position_new"}) print(old) print(new) merged = pd.merge(old, new, how='outer', on='keyword') print(merged) 

Output:

 keyword position_old 0 football 2 1 soccer 1 2 rugby 3 keyword position_new 0 hockey 3 1 rugby 2 2 soccer 1 keyword position_old position_new 0 football 2.0 NaN 1 soccer 1.0 1.0 2 rugby 3.0 2.0 3 hockey NaN 3.0 
$\endgroup$
3
  • $\begingroup$ Thanks for your help, will give it try. $\endgroup$ Commented Jan 19, 2020 at 16:35
  • $\begingroup$ Regrettably still can't get it to work, but thanks for your input. $\endgroup$ Commented Jan 19, 2020 at 18:34
  • $\begingroup$ Thanks so much for the input, that makes sense, thanks for your help. $\endgroup$ Commented Jan 20, 2020 at 7:41

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.