3

I've searched previous answers on this, but they seem to rely on numpy because the arrays contain numbers. I am trying to search a dataframe for a keyword ('Timeframe') that is part of a sentence, where the full sentence is 'Timeframe for wave in ____', and I would like to return the row and column index. For example:

 df.iloc[34,0] 

returns the string I am looking for, but I want to avoid hard-coding the position because the data can change. Is there a way to return [34, 0] when I search the dataframe for the keyword 'Timeframe'?

3
  • You can access the corresponding row by using df.index.get_loc, as explained in the target. Commented Jul 11, 2017 at 18:38
  • @ayhan - I reopened it, because it seems get_loc is not a solution. Commented Jul 11, 2017 at 18:42
  • @jezrael Yes, you are right. Commented Jul 11, 2017 at 18:45

2 Answers

4

EDIT:

To find the index, use str.contains with boolean indexing. There are then three possible outcomes (no match, one match, or several matches):

    import pandas as pd

    df = pd.DataFrame({'A': ['Timeframe for wave in ____', 'a', 'c']})
    print(df)
                                A
    0  Timeframe for wave in ____
    1                           a
    2                           c

    def check(val):
        # Index labels of the rows whose text contains val
        a = df.index[df['A'].str.contains(val)]
        if a.empty:
            return 'not found'
        elif len(a) > 1:
            return a.tolist()
        else:
            # only one value - return scalar
            return a.item()

    print(check('Timeframe'))
    0
    print(check('a'))
    [0, 1]
    print(check('rr'))
    not found
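Note that df.index[...] yields index labels, which match the positions used by iloc only when the dataframe has the default 0..n-1 index. The following is a minimal sketch, not part of the answer above, of how integer positions could be returned instead. The helper name find_positions and the sample data with a non-default index are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a non-default index, so labels and positions differ.
df = pd.DataFrame(
    {'A': ['x', 'Timeframe for wave in ____', 'c']},
    index=[10, 20, 30],
)

def find_positions(frame, column, keyword):
    """Return the integer row positions whose text contains keyword."""
    # regex=False treats keyword as a literal substring;
    # na=False counts missing values as "no match".
    mask = frame[column].str.contains(keyword, regex=False, na=False)
    return np.flatnonzero(mask.to_numpy()).tolist()

rows = find_positions(df, 'A', 'Timeframe')
col = df.columns.get_loc('A')   # integer position of column 'A'
print(rows, col)
print(df.iloc[rows[0], col])
```

Always returning a list (possibly empty) avoids the three different return types of check, at the cost of the caller having to handle the empty case before indexing.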

Old solution:

It seems you need numpy.where to check for the value Timeframe (this matches only cells that equal the keyword exactly):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': list('abcdef'),
                       'B': [4, 5, 4, 5, 5, 4],
                       'C': [7, 8, 9, 4, 2, 'Timeframe'],
                       'D': [1, 3, 5, 7, 1, 0],
                       'E': [5, 3, 6, 9, 2, 4],
                       'F': list('aaabbb')})
    print(df)
       A  B          C  D  E  F
    0  a  4          7  1  5  a
    1  b  5          8  3  3  a
    2  c  4          9  5  6  a
    3  d  5          4  7  9  b
    4  e  5          2  1  2  b
    5  f  4  Timeframe  0  4  b

    # Row and column positions of every cell equal to 'Timeframe'
    a = np.where(df.values == 'Timeframe')
    print(a)
    (array([5], dtype=int64), array([2], dtype=int64))

    # Keep the first match as [row, column]
    b = [x[0] for x in a]
    print(b)
    [5, 2]

3 Comments

Thanks jezrael! One thing I didn't realize would be a problem is that Timeframe is not the whole string. I initially tried a df[df == 'Timeframe'] kind of search, expecting it to locate the first instance of Timeframe. However, since Timeframe is part of a sentence, it does not return a result. The full sentence is 'Timeframe for wave in____ '. Do you have any tips?
Hmm, and should the output be the positions, or the whole sentence?
Since my df has only one column, I would just need to return the row where the sentence 'Timeframe for wave in___' appears. Does that help clarify? So in my example, df.iloc[34,0] actually corresponds to where the sentence appears.
3

If you need to search across multiple columns, you can use the following code:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame([[1, 2, 3, 4],
                       ["a", "b", "Timeframe for wave in____", "d"],
                       [5, 6, 7, 8]])

    # One boolean array per column; na=False treats non-string cells as no match
    mask = np.column_stack([df[col].str.contains("Timeframe", na=False) for col in df])
    find_result = np.where(mask)
    # Row and column position of the first match
    result = [find_result[0][0], find_result[1][0]]

The output for df and result would then be:

    >>> df
       0  1                          2  3
    0  1  2                          3  4
    1  a  b  Timeframe for wave in____  d
    2  5  6                          7  8
    >>> result
    [1, 2]
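The result above keeps only the first match. If the keyword can occur in more than one cell, a sketch along the following lines could collect every [row, column] position. The helper name find_all and the extra "Timeframe again" cell are hypothetical, and the mask is built with a plain Python substring test rather than str.contains, so that purely numeric columns (which have no .str accessor) do not raise an error.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the keyword appears in two different cells.
df = pd.DataFrame([[1, 2, 3, 4],
                   ["a", "b", "Timeframe for wave in____", "d"],
                   [5, "Timeframe again", 7, 8]])

def find_all(frame, keyword):
    """Return [row, col] integer positions of every string cell containing keyword."""
    # Plain substring test per cell; non-string cells never match.
    mask = np.array(
        [[isinstance(v, str) and keyword in v for v in row]
         for row in frame.to_numpy()],
        dtype=bool,
    )
    # argwhere lists the True positions in row-major order.
    return np.argwhere(mask).tolist()

print(find_all(df, "Timeframe"))
```

Each returned pair can be passed straight to iloc, for example df.iloc[1, 2].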
