
I am trying to solve the following problem. I have two data sets, say df1 and df2:

df1
    NameSP  Val       Char1  BVA
0   'ACCR'  0.091941  A      Y'
1   'SDRE'  0.001395  S      Y'
2   'ACUZ'  0.121183  A      N'
3   'SRRE'  0.001512  S      N'
4   'FFTR'  0.035609  F      N'
5   'STZE'  0.000637  S      N'
6   'AHZR'  0.001418  A      Y'
7   'DEES'  0.000876  D      N'
8   'UURR'  0.023878  U      Y'
9   'LLOH'  0.004371  L      Y'
10  'IUUT'  0.049102  I      N'

df2
   NameSP  Val1   Glob
0  'ACCR'  0.234  20000
1  'FFTR'  0.222  10000
2  'STZE'  0.001  5000
3  'DEES'  0.006  2000
4  'UURR'  0.134  20000
5  'LLOH'  0.034  10000

I would like to find the positions of the df2 names within df1, and then use that index vector for various matrix operations. This would be similar to strmatch(A,B,'exact') in Matlab. I can get the index correctly by using .iloc and then .isin, as in the following code:

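import pandas as pd
import numpy as np

# Raw strings keep the backslashes in the Windows paths literal.
df1 = pd.read_excel(r'C:\PYTHONCODES\LINEAROPT\TEST_DATA1.xlsx')
df2 = pd.read_excel(r'C:\PYTHONCODES\LINEAROPT\TEST_DATA2.xlsx')
print(df1)
print(df2)

# First column (NameSP) of each frame.
ddf1 = df1.iloc[:,0]
ddf2 = df2.iloc[:,0]

# Rows of df1 whose name also appears in df2.
pindex = ddf1[ddf1.isin(ddf2)]
print(pindex.index)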

which gives me:

Int64Index([0, 4, 5, 7, 8, 9], dtype='int64') 

But I cannot find a way to use this index for mapping and building my arrays. As an example, I would like a vector with the same number of elements as df1, holding the Val1 values from df2 at the indexed positions and zeros everywhere else. So it should look like this:

0.234 0 0 0 0.222 0.001 0 0.006 0.134 0.034 0 

Or another mapping problem: how can I use such an index to map the values from the column "Val" in df1 into a vector that contains Val from df1 at the indexed rows and zeros everywhere else? This time it should look like:

0.091941 0.0 0.0 0.0 0.035609 0.000637 0.0 0.000876 0.023878 0.004371 0.0 

Any idea how to do that in an efficient and elegant way?

Thanks for help!

1 Answer

First problem

df2.set_index('NameSP')['Val1'].reindex(df1['NameSP']).fillna(0) 
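
As a check, here is a minimal sketch of that one-liner on the sample data from the question. It assumes the names are stored as plain strings without the quote characters shown in the printout, and that NameSP is unique in df2 (reindex raises an error when the index being reindexed has duplicate labels). The variable names aligned and vec1 are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df1 = pd.DataFrame({
    "NameSP": ["ACCR", "SDRE", "ACUZ", "SRRE", "FFTR", "STZE",
               "AHZR", "DEES", "UURR", "LLOH", "IUUT"],
})
df2 = pd.DataFrame({
    "NameSP": ["ACCR", "FFTR", "STZE", "DEES", "UURR", "LLOH"],
    "Val1": [0.234, 0.222, 0.001, 0.006, 0.134, 0.034],
})

# Look up Val1 by name, align it to df1's name order,
# and fill the names that are missing from df2 with 0.
aligned = df2.set_index("NameSP")["Val1"].reindex(df1["NameSP"]).fillna(0)

# The result is indexed by NameSP; to_numpy() gives a plain positional vector.
vec1 = aligned.to_numpy()
print(vec1)
```

The result has one entry per row of df1, in df1's order, so it can be used directly in array operations alongside df1's columns.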

Second problem

df1['Val'].where(df1['NameSP'].isin(df2['NameSP']), 0) 
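
A minimal sketch of the second case on the question's sample data, using the Val column of df1 (the column the question asks for). It again assumes the names are plain strings without the quote characters; mask, positions and vec2 are illustrative names. Because where keeps the shape of the Series it is called on, the result has the length of df1, not df2.

```python
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df1 = pd.DataFrame({
    "NameSP": ["ACCR", "SDRE", "ACUZ", "SRRE", "FFTR", "STZE",
               "AHZR", "DEES", "UURR", "LLOH", "IUUT"],
    "Val": [0.091941, 0.001395, 0.121183, 0.001512, 0.035609, 0.000637,
            0.001418, 0.000876, 0.023878, 0.004371, 0.049102],
})
df2 = pd.DataFrame({
    "NameSP": ["ACCR", "FFTR", "STZE", "DEES", "UURR", "LLOH"],
})

# True for the rows of df1 whose name also appears in df2.
mask = df1["NameSP"].isin(df2["NameSP"])

# Keep Val where the mask is True and put 0 everywhere else.
vec2 = df1["Val"].where(mask, 0).to_numpy()

# Integer positions of the matching rows, comparable to the index in the question.
positions = mask.to_numpy().nonzero()[0]
print(positions)
print(vec2)
```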

2 Comments

Fantastic! This works great for the first problem. Thanks. For the second one it's perfect too. Thanks so much!
len of df1 or df2 for second problem?
