4

I have a data set (sample) like below

Date Value 2019-05-01 0 2019-05-02 0 2019-05-03 0 2019-05-04 0 2019-05-05 0 2019-05-06 0 2019-05-07 0 2019-05-08 1 2019-05-09 0 

I want to transform it such that, if I encounter Value=1, then I take the 3 values from 2 days before and fill it as 1. Also set the current value to be 0. In other words, the transformed data set should look like this

Date Value 2019-05-01 0 2019-05-02 0 2019-05-03 1 2019-05-04 1 2019-05-05 1 2019-05-06 0 2019-05-07 0 2019-05-08 0 2019-05-09 0 

Do notice, that in the example above, 2019-05-08 was set to 0 after transformation, and 2019-05-03 to 2019-05-05 was set to 1 (last value set to 1 is 2 days before 2019-05-08 and 3 days preceding 2019-05-05 is also set to 1). If two consecutive values show up as 1, we start the date calculation from the last value that shows up as 1. I think I can do this via for loops, but was looking to see if any inbuilt functions can help me with this. Thanks!

3 Answers 3

1

There could be more precise ways of solving this problem. However, I could only think of solving this using the index values(say i) where Value==1 and then grab the index values at preceding locations(2 dates before means i-3 and then two more values above it means i-4, i-5) and assign the Value to 1. Finally, set the Value back to 0 for the index location(s) that were originally found for Value==1.

In [53]: df = pd.DataFrame({'Date':['2019-05-01','2019-05-02', '2019-05-03','2019-05-04','2019-05-05', '2019-05-06','20 ...: 19-05-07','2019-05-08','2019-05-09'], 'Value':[0,0,0,0,0,0,0,1,0]}) ...: ...: In [54]: val_1_index = df.loc[df.Value == 1].index.tolist() In [55]: val_1_index_decr = [(i-3, i-4, i-5) for i in val_1_index] In [56]: df.loc[df['Value'].index.isin([i for i in val_1_index_decr[0]]), 'Value'] = 1 In [57]: df.loc[df['Value'].index.isin(val_1_index), 'Value'] = 0 In [58]: df Out[58]: Date Value 0 2019-05-01 0 1 2019-05-02 0 2 2019-05-03 1 3 2019-05-04 1 4 2019-05-05 1 5 2019-05-06 0 6 2019-05-07 0 7 2019-05-08 0 8 2019-05-09 0 
Sign up to request clarification or add additional context in comments.

Comments

1

A one line solution, assuming that df is your original dataframe:

df['Value'] = pd.Series([1 if 1 in df.iloc[i+3:i+6].values else 0 for i in df.index]) 

Here I work on index rather than dates, so I assume that you have one day per row and days are consecutive as shown in your example.

To fit also for this request:

If two consecutive values show up as 1, we start the date calculation from the last value that shows up as 1.

I can propose a two line solution:

validones = [True if df.iloc[i]['Value'] == 1 and df.iloc[i+1]['Value'] == 0 else False for i in df.index] df['Value'] = pd.Series([1 if any(validones[i+3:i+6]) else 0 for i in range(len(validones))]) 

Basically first I build a list of boolean to check if the 1 in df['Value'] is not followed by another 1 and use this boolean list to perform the substitutions.

Comments

0

No sure about the efficiency of this solution because one needs to create three new columns but this also works:

df['shiftedValues'] = \ df['Value'].shift(-3, fill_value=0) + \ df['Value'].shift(-4, fill_value=0) + \ df['Value'].shift(-5, fill_value=0) 

Note that the shift is done by row and not by day.

To shift by actual days I would first index by dates

df['Date'] = pd.to_datetime(df['Date']) df = df.set_index('Date') df['shiftedValues'] = \ df['Value'].shift(-3, freq='1D', fill_value=0).asof(df.index) + \ df['Value'].shift(-4, freq='1D', fill_value=0).asof(df.index) + \ df['Value'].shift(-5, freq='1D', fill_value=0).asof(df.index) # Out: # Value shiftedValues # Date # 2019-05-01 0 0.0 # 2019-05-02 0 0.0 # 2019-05-03 0 1.0 # 2019-05-04 0 1.0 # 2019-05-05 0 1.0 # 2019-05-06 0 0.0 # 2019-05-07 0 0.0 # 2019-05-08 1 0.0 # 2019-05-09 0 0.0 

Now this works correctly for dates, for instance if df is (note the missing and repeated days)

 Date Value 0 2019-05-01 0 1 2019-05-02 0 2 2019-05-03 0 3 2019-05-04 0 4 2019-05-05 0 5 2019-05-05 0 6 2019-05-07 0 7 2019-05-08 1 8 2019-05-09 0 

then you get

 Value shiftedValues Date 2019-05-01 0 0.0 2019-05-02 0 0.0 2019-05-03 0 1.0 2019-05-04 0 1.0 2019-05-05 0 1.0 2019-05-05 0 1.0 2019-05-07 0 0.0 2019-05-08 1 0.0 2019-05-09 0 0.0 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.