String Indexing in dataframe subset - pandas

Question

I'm trying to create a subset of a pandas DataFrame based on values in a list. However, I need to match on part of each string (some form of string indexing), not only on exact values. I'll demonstrate with an example:

Here is my dataframe:

import pandas as pd
df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})

Here is what it looks like:

     A
0  1-2
1    2
2    3
3  3-8
4    4

I have a list of values I want to use to select rows from my dataframe.

list1 = ['2', '3']

I can use the .isin() method to select the rows whose value exactly matches one of my list items.

subset = df[df['A'].isin(list1)]
print(subset)

   A
1  2
2  3

However, I want any value that includes '2' or '3'. This is my desired output:

     A
0  1-2
1    2
2    3
3  3-8

Can I use string indexing in my .isin() function? I am struggling to come up with another workaround.

BENY · Accepted Answer · 2019-10-29 19:07:48Z


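Use str.split with isin and any:

# .any(axis=1) keeps rows where any split part matches (older pandas also accepted .any(1))
Newdf = df[df.A.str.split('-', expand=True).isin(['2', '3']).any(axis=1)].copy()

Out[189]:
     A
0  1-2
1    2
2    3
3  3-8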
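To make the one-liner easier to follow, here is a minimal sketch that breaks it into its three steps. The intermediate names (parts, hits, mask) are only illustrative and are not part of the original answer; the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})
list1 = ['2', '3']

# Step 1: split each string on '-' into separate columns;
# rows with fewer parts get missing values in the extra columns.
parts = df['A'].str.split('-', expand=True)

# Step 2: element-wise test of every part against the list.
hits = parts.isin(list1)

# Step 3: a row is kept if at least one of its parts matched.
mask = hits.any(axis=1)

subset = df[mask].copy()
print(subset)
```

Because each part is compared as a whole, a value such as '12' would not be selected by '2' with this approach.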

2 Comments

Erich Purpur Over a year ago

What does .any() do? More specifically, what does the argument (1) in .any(1) mean?

BENY Over a year ago

@ErichPurpur The 1 is the axis, so .any(1) means .any(axis=1): it returns True for each row that contains at least one True.
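A small sketch of what the axis argument changes, using a made-up boolean frame (flags is a hypothetical name, shaped like the result of .isin() in the answer). The axis is passed as a keyword here, which is the form recent pandas versions expect.

```python
import pandas as pd

# Hypothetical boolean frame: 3 rows, 2 columns.
flags = pd.DataFrame({0: [False, True, False],
                      1: [True, False, False]})

# axis=1 reduces across the columns: one result per row.
per_row = flags.any(axis=1)

# axis=0 (the default) reduces down the rows: one result per column.
per_column = flags.any(axis=0)

print(per_row.tolist())     # [True, True, False]
print(per_column.tolist())  # [True, True]
```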

Georgina Skibinski · Accepted Answer · 2019-10-29 19:18:56Z

You can try a regular expression:

import re

pattern = re.compile(".*((" + ")|(".join(list1) + "))")
print(df.loc[df['A'].apply(lambda x: True if pattern.match(x) else False)])

Output:

     A
0  1-2
1    2
2    3
3  3-8
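The same regex idea can probably be written without apply. The sketch below swaps in pandas' Series.str.contains, which is not what the answer uses, and adds re.escape on the assumption that list items might contain regex metacharacters.

```python
import re
import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})
list1 = ['2', '3']

# Build an alternation such as '2|3'; re.escape makes each item literal.
pattern = '|'.join(re.escape(value) for value in list1)

# str.contains runs a regex search anywhere in each string.
subset = df[df['A'].str.contains(pattern)]
print(subset)
```

Note that both regex variants match substrings, so a value such as '12' would also be selected by '2'. If only whole parts between the '-' separators should count, the split-based approach is the safer choice.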
