How do I find the iloc of a row in a pandas DataFrame?

Question

I have an indexed pandas dataframe. By searching through its index, I find a row of interest. How do I find out the iloc of this row?

Example:

    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    df

                       A         B         C         D
    2000-01-01 -0.077564  0.310565  1.112333  1.023472
    2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
    2000-01-03  1.023574 -0.139773  0.736999  1.417595
    2000-01-04 -0.191934  0.319612  0.606402  0.392500
    2000-01-05 -0.281087 -0.273864  0.154266  0.374022
    2000-01-06 -1.953963  1.429507  1.730493  0.109981
    2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
    2000-01-08 -0.032872 -0.237807  0.705088  0.978011

    window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
    window_stop_row

    Timestamp('2000-01-08 00:00:00', offset='D')

    #which is the iloc of window_stop_row?

@filmor: something like window_start_row = values[values.Timestamp < row.Timestamp - window_length][-1]. I need the iloc of window_start_row.
– lmsasu, Commented Jan 20, 2016 at 10:10
@anton: I found the whole row. I need some values from its cells, but I also need its iloc.
– lmsasu, Commented Jan 20, 2016 at 10:11

Skippy le Grand Gourou · Accepted Answer · 2024-10-25 09:10:14Z

Generally speaking, pass the named index value to index.get_loc:

df.index.get_loc(row_of_interest_named_index)

Since you’re dealing with dates it may be more convenient to retrieve the index value with .name:

    In [131]:
    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    df

    Out[131]:
                       A         B         C         D
    2000-01-01  0.095234 -1.000863  0.899732 -1.742152
    2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
    2000-01-03  0.134112  1.964386 -0.120282  0.573676
    2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
    2000-01-05 -1.777800  0.795093  0.120681  0.524045
    2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
    2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
    2000-01-08 -0.147326  0.717392  3.138620  1.208639

    In [133]:
    window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
    window_stop_row.name

    Out[133]:
    Timestamp('2000-01-03 00:00:00', offset='D')

    In [134]:
    df.index.get_loc(window_stop_row.name)

    Out[134]:
    2

get_loc returns the ordinal position of the label in your index, which is what you want:

    In [135]:
    df.iloc[df.index.get_loc(window_stop_row.name)]

    Out[135]:
    A    0.134112
    B    1.964386
    C   -0.120282
    D    0.573676
    Name: 2000-01-03 00:00:00, dtype: float64
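For anyone who wants to run the steps above end to end, here is a minimal, self-contained sketch. The seeded random generator is an assumption added only to make the data reproducible; it is not part of the original answer. Note that get_loc returns a plain integer only when the label is unique in the index; with duplicate labels it can return a slice or a boolean mask instead.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question; the seed is only for reproducibility.
dates = pd.date_range("2000-01-01", periods=8)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list("ABCD"))

# Last row strictly before 2000-01-04. Because .iloc[-1] returns a Series,
# its .name attribute holds the index label of that row.
window_stop_row = df[df.index < pd.Timestamp("2000-01-04")].iloc[-1]
label = window_stop_row.name

# Translate the label into an integer position usable with .iloc.
position = df.index.get_loc(label)
print(label, position)

# Round trip: the position selects the same row again.
same_row = df.iloc[position]
```

If the selection step returns a DataFrame rather than a Series (for example a slice such as df.iloc[-1:]), there is no .name attribute, which is one way to end up with the AttributeError mentioned in the comments.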

If you just want to search the index, you can use searchsorted, provided the index is sorted:

    In [142]:
    df.index.searchsorted('2000-01-04') - 1

    Out[142]:
    2
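The searchsorted approach has two edge cases worth spelling out: it silently gives wrong answers on an unsorted index, and subtracting 1 yields -1 when the cutoff precedes every label. The sketch below wraps both checks in a helper; position_before is a hypothetical name chosen for illustration, not a pandas function.

```python
import pandas as pd


def position_before(index, cutoff):
    """Hypothetical helper: integer position of the last label strictly
    before `cutoff`, or None if no such label exists."""
    if not index.is_monotonic_increasing:
        raise ValueError("searchsorted requires a sorted index")
    # side="left" gives the first position whose label is >= cutoff,
    # so the entry just before it is the last label < cutoff.
    pos = index.searchsorted(cutoff, side="left") - 1
    return int(pos) if pos >= 0 else None


index = pd.date_range("2000-01-01", periods=8)
print(position_before(index, pd.Timestamp("2000-01-04")))
print(position_before(index, pd.Timestamp("1999-12-31")))
```

Because searchsorted is a binary search, this stays fast on large indexes, which is the point raised in the comments about index-based searching.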

@lmsasu Which bit? The .name attribute is not so obvious, but the rest seems reasonable to me.
@lmsasu if you want to search the index, then so long as the index is sorted then this would give you what you want: In [138]: df.index.searchsorted('2000-01-04') - 1 Out[138]: 2
I would have loved to use this, but I get AttributeError: 'DataFrame' object has no attribute 'name'. stackoverflow.com/questions/56214275/…
IMO, the method ought to be named get_iloc, as it's the integer location that it's returning. The name get_loc seems incorrect.

ascripter · Accepted Answer · 2018-03-01 16:35:51Z

While pandas.Index.get_loc() only works for a single key, the following pattern also works for getting the iloc of multiple elements:

np.argwhere(condition).flatten() # array of all iloc where condition is True

In your case, picking the latest element where df.index < '2000-01-04':

np.argwhere(df.index < '2000-01-04').flatten()[-1] # returns 2
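As a variation on the same idea, np.flatnonzero returns the flat array of positions directly, and Index.get_indexer looks up several known labels at once. This is a sketch using a bare DatetimeIndex like the one in the question; the labels in `wanted` are made up for illustration, and get_indexer assumes the index has no duplicate labels.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=8)

# Comparing the index with a timestamp gives a boolean NumPy array;
# flatnonzero lists the integer positions where it is True.
mask = index < pd.Timestamp("2000-01-04")
positions = np.flatnonzero(mask)
last_position = positions[-1]
print(positions, last_position)

# For a list of known labels, get_indexer returns one position per label
# and -1 for labels that are not in the index.
wanted = pd.to_datetime(["2000-01-02", "2000-01-07", "2001-01-01"])
label_positions = index.get_indexer(wanted)
print(label_positions)
```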

Anton Protopopov · Accepted Answer · 2016-01-20 10:35:01Z

IIUC you could call index for your case:

    In [53]:
    df[df.index < '2000-01-04'].index[-1]

    Out[53]:
    Timestamp('2000-01-03 00:00:00', offset='D')

EDIT

I think @EdChum's answer is what you want. Alternatively, you could compare the dataframe against the values of the row you found, use all to find the row where every value matches, and then pass the result to the index:

    In [67]:
    df == window_stop_row

    Out[67]:
                    A      B      C      D
    2000-01-01  False  False  False  False
    2000-01-02  False  False  False  False
    2000-01-03   True   True   True   True
    2000-01-04  False  False  False  False
    2000-01-05  False  False  False  False
    2000-01-06  False  False  False  False
    2000-01-07  False  False  False  False
    2000-01-08  False  False  False  False

    In [68]:
    (df == window_stop_row).all(axis=1)

    Out[68]:
    2000-01-01    False
    2000-01-02    False
    2000-01-03     True
    2000-01-04    False
    2000-01-05    False
    2000-01-06    False
    2000-01-07    False
    2000-01-08    False
    Freq: D, dtype: bool

    In [69]:
    df.index[(df == window_stop_row).all(axis=1)]

    Out[69]:
    DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
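The value-matching approach above ends with an index label rather than an integer. One way to get the position from the same boolean mask is sketched below; the arange data is an assumption chosen so every row is distinct. Be aware that this relies on exact equality of values, so rows containing NaN never match, and it scans the whole frame instead of using the index.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)
# Deterministic illustrative data in which every row is distinct.
df = pd.DataFrame(
    np.arange(32, dtype=float).reshape(8, 4), index=dates, columns=list("ABCD")
)

target_row = df.iloc[2]  # stands in for the row found earlier

# True for rows where every column equals the target row's value.
row_matches = (df == target_row).all(axis=1)

# Convert the boolean Series into integer positions.
positions = np.flatnonzero(row_matches.to_numpy())
print(positions)
```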

Thanks. However, I would like to get the answer "2", the iloc of this row. That is my question: how do I get this number?
After Anton's edit: it would be nice to make use of the index and search based on it. It is expected that index-based searching works faster.

Tom Patel · Accepted Answer · 2016-01-20 10:25:54Z

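You could try looping through each row in the dataframe, counting positions with enumerate (iterrows by itself yields the index label of each row, not its integer position):

    for row_number, (label, row) in enumerate(dataframe.iterrows()):
        if row['column_header'] == YourValue:
            print(row_number)

This prints the integer position of each matching row, which you can then use with iloc.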
