1

In column 'a' I have values which are numbers separated by a comma (ranging from 1 to 35). e.g. '1,6,7,3,5,15,6,25,30' and '5,6,7,33' '1,6,29,15'

In a new column 'b', I want the value to say 'yes' whenever the value in column A is 5 or it's variations ,5 (comma 5) or 5, (5 comma). However I don't want values such as 15 or 25 included. Is there a way to include all combinations of 5 with a comma but not anything else?

df.loc[df['a'].str.contains(',5'), 'b'] = 'yes' df.loc[df['a'].str.contains('5,'), 'b'] = 'yes' 
8
  • ^5$|^5,|,5,|,5$ seems to cover all the bases. regex101.com/r/NsqaP6/1 Commented Jan 14, 2020 at 13:05
  • 1
    what if value contains both 5 and 15? like 5,2,15,5 Commented Jan 14, 2020 at 13:08
  • 2
    @jofrev The goal is a boolean check for existence, not extract all occurrences. Commented Jan 14, 2020 at 13:31
  • 1
    @jofrev No worries. As with any regex question posted on this site, very seldom is the poster aware of how their choice of words affects the solution presented. Most questions have a lot of incompatible puffery which makes the question unanswerable on the first try. At any rate, your regex would also work for the goal at hand. Commented Jan 14, 2020 at 13:40
  • 1
    How about \b5\b? Commented Jan 14, 2020 at 14:07

2 Answers 2

1

I would suggest converting your comma-separated string into an array (see here for how: How to convert a string to a list in Python?).

Then you can check if the search value (e.g. '5') exists in the array using in, e.g.:

if searchValue in arrayOfNumbers { return True }

(Or you could try a ternary operator, however that's done in Python)

Sign up to request clarification or add additional context in comments.

1 Comment

That sounds pretty slow
1

I would suggest something like:

# your dataframe df = pandas.DataFrame({'A': ['1,2,34,5,6', '32,2,4,67,5', '4,3,2,1,']}) df['B'] = df['A'].apply(lambda x : True if '5' in x.split(',') else False) 

this will add a column B to your dataframe containing True if 5 is there and False otherwise.

 A B 0 1,2,34,5,6 True 1 32,2,4,67,5 True 2 4,3,2,1, False 

2 Comments

I think this will include 15 and 25, which I want to avoid
if you try True if '5' in ['15', '25'] else False you'll see that the above solution is correct. The split does the trick of converting your string in a list of strings.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.