0

I would want to find a way in python to merge the files on 'seq' but return all the ones with the same id, in this example only the lines with id 2 would be removed.

File one:

seq,id CSVGPPNNEQFF,0 CTVGPPNNEQFF,0 CTVGPPNNERFF,0 CASRGEAAGFYEQYF,1 RASRGEAAGFYEQYF,1 CASRGGAAGFYEQYF,1 CASSDLILYYEQYF,2 CASSDLILYYTQYF,2 CASSGSYEQYF,3 CASSGSYEQYY,3 

File two:

seq CSVGPPNNEQFF CASRGEAAGFYEQYF CASSGSYEQYY 

Output:

seq,id CSVGPPNNEQFF,0 CTVGPPNNEQFF,0 CTVGPPNNERFF,0 CASRGEAAGFYEQYF,1 RASRGEAAGFYEQYF,1 CASRGGAAGFYEQYF,1 CASSGSYEQYF,3 CASSGSYEQYY,3 

I have tried:

df3 = df1.merge(df2.groupby('seq',as_index=False)[['seq']].agg(','.join),how='right') 

output:

seq,id CASRGEAAGFYEQYF,1 CASSGSYEQYY,3 CSVGPPNNEQFF,0 

Does anyone have any advice how to solve this?

2 Answers 2

2

Do you want to merge two dataframes, or just take subset of the first dataframe according to which id is included in the second dataframe (by seq)? Anyway, this gives the required result.

df1 = pd.DataFrame({ 'seq': [ 'CSVGPPNNEQFF', 'CTVGPPNNEQFF', 'CTVGPPNNERFF', 'CASRGEAAGFYEQYF', 'RASRGEAAGFYEQYF', 'CASRGGAAGFYEQYF', 'CASSDLILYYEQYF', 'CASSDLILYYTQYF', 'CASSGSYEQYF', 'CASSGSYEQYY' ], 'id': [0, 0, 0, 1, 1, 1, 2, 2, 3, 3] }) df2 = pd.DataFrame({ 'seq': [ 'CSVGPPNNEQFF', 'CASRGEAAGFYEQYF', 'CASSGSYEQYY' ] }) df3 = df1.loc[df1['id'].isin(df1['id'][df1['seq'].isin(df2['seq'])])] 

Explanation: df1['id'][df1['seq'].isin(df2['seq'])] takes those values of id from df1 that contain at least one seq that is included in df2. Then all rows with those values of id are taken from df1.

Sign up to request clarification or add additional context in comments.

Comments

0

You can use the isin() pandas method, code shall looks as follow :

df1.loc[df1['seq'].isin(df2['seq'])] 

Assuming, both objects are pandas dataframe and 'seq' is a column.

2 Comments

The output of this command is exactly like what I had with the df.merge command.
My bad, I didn't get the point about 'id' in your question. The other answer given is the right one.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.