
I have been struggling with this problem all day. I have two dataframes as follows:

Dataframe 1 - Billboards

[Image of Dataframe 1 not available]

Dataframe 2

[Image of Dataframe 2 not available]

I would like to merge Dataframe 2 with Dataframe 1 on the song title to end up with a dataframe that has SongId, Song, Rank and Year. The problem is that the titles are not stored consistently. For example, a Song in Billboards can be "macarena bayside boys mix" while the same Song in Dataframe 2 might be just "macarena". I would like to match such titles by similarity.

  • Post dataframes as text, not as an image. Commented May 28, 2018 at 16:32

2 Answers

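I think you need to calculate a similarity measure between the song titles in df1 and df2. I gave it a try by calculating the cosine similarity between TF-IDF vectors of the titles, using a small sample song list. Note that the score is a similarity rather than a distance: a higher value means a closer match, which is why the code keeps the maximum.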

from sklearn.feature_extraction.text import TfidfVectorizer

vect = TfidfVectorizer(min_df=1)
Song1 = ["macarena bayside boys mix", "cant you hear my heart beat", "crying in the chapell", "you were on my mind"]
Song2 = ["cause im a man", "macarena", "beat from my heart"]

dist_dict = {}   # best similarity score found so far for each song in Song1
match_dict = {}  # title from Song2 that produced that score
for i in Song1:
    for j in Song2:
        # TF-IDF rows are L2-normalised, so their dot product is the cosine similarity
        tfidf = vect.fit_transform([i, j])
        distance = (tfidf @ tfidf.T).toarray()[0, 1]
        # Keep the highest-scoring candidate, including the first one seen
        if i not in dist_dict or dist_dict[i] < distance:
            dist_dict[i] = distance
            match_dict[i] = j

match_dict now holds the best match for each song in Song1, and dist_dict the corresponding cosine similarity.
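To make the similarity score concrete, here is a minimal plain-Python sketch of cosine similarity between two titles. It is an illustration only: it uses raw word counts rather than TF-IDF weights, so the scores will differ from the scikit-learn version, and the helper names cosine_similarity and best_match are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two titles, using raw word counts."""
    ca, cb = Counter(a.split()), Counter(b.split())
    # Dot product over the words the two titles share
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def best_match(title, candidates):
    """Return the (candidate, score) pair with the highest similarity."""
    scored = ((c, cosine_similarity(title, c)) for c in candidates)
    return max(scored, key=lambda pair: pair[1])


candidates = ["cause im a man", "macarena", "beat from my heart"]
print(best_match("macarena bayside boys mix", candidates))
```

A title that shares no words with any candidate scores 0.0 against all of them, so in practice it may be worth rejecting matches below some minimum score instead of always accepting the top one.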

Once you have the best match, you can look up the song ID in df2.
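The lookup step might look like the sketch below. The dataframes are hypothetical stand-ins with made-up values, the column layout (Song, Rank, Year in df1 and SongId, Song in df2) is assumed from the question, and match_dict is hard-coded to the kind of result the matching step would produce. MatchedSong is a helper column introduced only for this example.

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes; values are made up.
df1 = pd.DataFrame({
    "Song": ["macarena bayside boys mix", "cant you hear my heart beat"],
    "Rank": [1, 2],
    "Year": [1996, 1965],
})
df2 = pd.DataFrame({
    "SongId": ["S1", "S2", "S3"],
    "Song": ["cause im a man", "macarena", "beat from my heart"],
})

# Assumed output of the matching step: Billboard title -> best title in df2.
match_dict = {
    "macarena bayside boys mix": "macarena",
    "cant you hear my heart beat": "beat from my heart",
}

# Attach the matched df2 title to each Billboard row, then merge on it.
df1["MatchedSong"] = df1["Song"].map(match_dict)
merged = df1.merge(
    df2.rename(columns={"Song": "MatchedSong"}),
    on="MatchedSong",
    how="left",
)
result = merged[["SongId", "Song", "Rank", "Year"]]
print(result)
```

Using how="left" keeps every Billboard row, so songs without a match show up with a missing SongId instead of silently disappearing.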


The easiest way to do it:

1. Make "Song" the index column in both dataframes:

df1.set_index('Song', inplace=True)
df2.set_index('Song', inplace=True)

2. Use join:

joined = df1.join(df2, how='inner')

1 Comment

Check the last sentences of the OP: "The problem is that there are some variations in how the Songs are stored. ex: Song in Billboard can be macarena bayside boys mix while Song in Dataframe 2 might be macarena. I wanted to find similarities."
