
I need to check whether the last value of dataframe['position'] is different from 0 and, if so, count the preceding values (going backwards from the end) until the value changes, then store that count in a variable. I want to do this without for loops, using loc or iloc for example.

dataframe:

   | position |
0    1
1    0
2    1    <4
3    1    <3
4    1    <2
5    1    <1

count = 4

I achieved this by a for loop, but I need to avoid it:

count = 1
if data['position'].iloc[-1] != 0:
    for i in data['position']:
        if data['position'].iloc[-count] == data['position'].iloc[-1]:
            count = count + 1
        else:
            break
    if data['position'].iloc[-count] != data['position'].iloc[-1]:
        count = count - 1
  • df['col1'][::-1].cumprod().sum() Commented Sep 29, 2022 at 2:14

2 Answers


You can reverse your Series, convert it to boolean using the target condition (here "not equal to 0", with ne), apply cummin so that the first False propagates to every later position of the reversed Series, and sum to count the trailing True values:

count = df.loc[::-1, 'position'].ne(0).cummin().sum() 

Output: 4
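To make the intermediate steps visible, here is a minimal sketch that rebuilds the example frame from the question and prints each stage. The variable names reversed_nonzero and still_trailing are only illustrative and do not come from the answer.

```python
import pandas as pd

# Example data from the question: the last four values are non-zero.
df = pd.DataFrame({"position": [1, 0, 1, 1, 1, 1]})

# Step 1: reverse the column and flag the non-zero values (last row first).
reversed_nonzero = df.loc[::-1, "position"].ne(0)

# Step 2: cummin keeps True only until the first False is met.
still_trailing = reversed_nonzero.cummin()

# Step 3: summing the booleans counts the trailing non-zero values.
count = int(still_trailing.sum())

print(reversed_nonzero.tolist())  # [True, True, True, True, False, True]
print(still_trailing.tolist())    # [True, True, True, True, False, False]
print(count)                      # 4
```

Note that this counts trailing values that are non-zero; it does not check that they are all equal to the last value.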

If you have multiple columns:

counts = df.loc[::-1].ne(0).cummin().sum() 

Alternative

A slightly faster alternative (~25% faster) is to find the last zero and count the rows after it by label-based indexing. It relies on two assumptions: the column contains at least one zero, and the index has no duplicated labels.

m = df['position'].eq(0)
count = len(df.loc[m[m].index[-1]:])-1

Without the requirement to have at least one zero:

m = df['position'].eq(0)
m = m[m]
count = len(df) if m.empty else len(df.loc[m.index[-1]:])-1
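If the index may contain duplicated labels, one possible workaround is to locate the last zero by position rather than by label. The sketch below is a variant of the idea above, not code from the answer; it assumes NumPy is available and uses np.flatnonzero on the underlying array. The duplicated index labels are made up to illustrate the case.

```python
import numpy as np
import pandas as pd

# Same values as the question, but with duplicated index labels (hypothetical).
df = pd.DataFrame({"position": [1, 0, 1, 1, 1, 1]}, index=list("aabbcc"))

values = df["position"].to_numpy()

# Integer positions of every zero in the column.
zero_positions = np.flatnonzero(values == 0)

if zero_positions.size == 0:
    # No zero at all: every row belongs to the trailing run.
    count = len(values)
else:
    # Number of rows strictly after the last zero.
    count = int(len(values) - 1 - zero_positions[-1])

print(count)  # 4
```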


This should do the trick:

((data.iloc[-1] != 0) & (data[::-1] == data.iloc[-1])).cumprod().sum() 

This builds a condition ((data.iloc[-1] != 0) & (data[::-1] == data.iloc[-1])) indicating, for each row (counting backwards from the end), whether the last value is nonzero and the row's value equals it. The flags are then treated as 0 or 1 and the cumulative product is taken, so the first non-matching row (a 0 flag) breaks the sequence and every subsequent value is zero. Finally, the flags are summed to get the count of consecutive matching values at the end of the column.
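As a rough illustration on a single column, the sketch below applies the same steps to a small Series and prints the intermediate flags. The sample values are made up, and the explicit astype(int) is an added precaution so the cumulative product runs on integers.

```python
import pandas as pd

# Hypothetical sample: the column ends with two 1s preceded by a 2.
s = pd.Series([1, 0, 2, 1, 1], name="position")

last = s.iloc[-1]

# 1 where the last value is non-zero and the row equals it, else 0 (last row first).
flags = ((s[::-1] == last) & bool(last != 0)).astype(int)

# The cumulative product drops to 0 at the first mismatch and stays there.
run = flags.cumprod()
count = int(run.sum())

print(flags.tolist())  # [1, 1, 0, 0, 1]
print(run.tolist())    # [1, 1, 0, 0, 0]
print(count)           # 2

# For comparison, counting trailing non-zero values gives a different result here.
print(int(s[::-1].ne(0).cummin().sum()))  # 3
```

The two results differ because this approach stops at the first value that differs from the last one, whereas the ne(0)/cummin approach only stops at a zero.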

Depending on your data, though, stepping iteratively backwards from the end may be faster. This solution is vectorized, but it has to process the entire column and builds several intermediate results that are each the same size as the original series.
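For reference, an iterative version with an early exit could look like the sketch below. The function name trailing_run_length is hypothetical, and whether it actually beats the vectorized version depends on the column length and the typical run length, so it is worth timing on real data.

```python
import pandas as pd


def trailing_run_length(series: pd.Series) -> int:
    """Count how many values at the end of the series equal the last value.

    Returns 0 when the series is empty or its last value is 0.
    """
    values = series.to_numpy()
    if len(values) == 0 or values[-1] == 0:
        return 0

    last = values[-1]
    count = 0
    # Walk backwards and stop at the first value that differs from the last one.
    for value in values[::-1]:
        if value != last:
            break
        count += 1
    return count


print(trailing_run_length(pd.Series([1, 0, 1, 1, 1, 1])))  # 4
```

Unlike the loop in the question, this version also handles a column where every value matches the last one.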

Example:

In [12]: data = pd.DataFrame(np.random.randint(0, 3, size=(10, 5)), columns=list('ABCDE'))
    ...: data
Out[12]:
   A  B  C  D  E
0  2  0  1  2  0
1  1  0  1  2  1
2  2  1  2  1  0
3  1  0  1  2  2
4  1  1  0  0  2
5  2  2  1  0  2
6  2  1  1  2  2
7  0  0  0  1  0
8  2  2  0  0  1
9  2  0  0  2  1

In [13]: ((data.iloc[-1] != 0) & (data[::-1] == data.iloc[-1])).cumprod().sum()
Out[13]:
A    2
B    0
C    0
D    1
E    2
dtype: int64
