I have a table like this:
| event | value | time |
|---|---|---|
| seed | 57 | 2021-08-01 09:49:23 |
| ghy | 869 | 2021-08-02 09:50:12 |
| repo | 5324 | 2021-09-03 10:49:23 |
| repo | null | 2021-09-03 11:49:23 |
| harv | 12 | 2021-09-05 09:43:23 |
| weig | 5,37,12 | 2021-09-06 09:25:12 |
| repo | null,null,4,8 | 2021-09-07 09:12:23 |
| repo | 4,8,null,null | 2021-09-07 10:49:23 |
| repo | null,null,4,8 | 2021-09-08 17:49:23 |
| repo | 4,8,1,3 | 2021-09-09 12:12:23 |
| repo | 1356 | 2021-09-10 12:49:23 |
Sometimes the value column follows the pattern `null,null,x,y`, where x and y are natural numbers.
How can I delete every pair of records `x,y,null,null` followed by `null,null,x,y` that comes immediately after the first occurrence of `null,null,x,y`?
The expected output is:
| event | value | time |
|---|---|---|
| seed | 57 | 2021-08-01 09:49:23 |
| ghy | 869 | 2021-08-02 09:50:12 |
| repo | 5324 | 2021-09-03 10:49:23 |
| repo | null | 2021-09-03 11:49:23 |
| harv | 12 | 2021-09-05 09:43:23 |
| weig | 5,37,12 | 2021-09-06 09:25:12 |
| repo | null,null,4,8 | 2021-09-07 09:12:23 |
| repo | 4,8,1,3 | 2021-09-09 12:12:23 |
| repo | 1356 | 2021-09-10 12:49:23 |
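
To make the rule concrete, here is a minimal sketch of one possible reading of it, written in plain Python so it runs without pandas. It assumes `value` is a comma-separated string, that the rows are already sorted by `time`, and that a pair is dropped only when it repeats the same x and y as the kept `null,null,x,y` row directly before it. The helper names `parse`, `is_null_first` and `keep_mask` are made up for illustration.

```python
def parse(value):
    """Split a comma-separated string into a tuple of stripped tokens."""
    return tuple(part.strip() for part in value.split(","))


def is_null_first(tokens):
    """True for the pattern null,null,x,y where x and y are digit strings."""
    return (
        len(tokens) == 4
        and tokens[:2] == ("null", "null")
        and all(t.isdigit() for t in tokens[2:])
    )


def keep_mask(values):
    """Return one boolean per row: False for rows that should be dropped.

    Assumption: after a kept row null,null,x,y, every directly following
    pair (x,y,null,null then null,null,x,y) is dropped.
    """
    rows = [parse(v) for v in values]
    keep = [True] * len(rows)
    anchor = None  # tokens of the most recent kept row
    i = 0
    while i < len(rows):
        if (
            anchor is not None
            and is_null_first(anchor)
            and i + 1 < len(rows)
            and rows[i] == anchor[2:] + anchor[:2]  # x,y,null,null
            and rows[i + 1] == anchor               # null,null,x,y again
        ):
            keep[i] = keep[i + 1] = False
            i += 2
            continue
        anchor = rows[i]
        i += 1
    return keep


values = [
    "57", "869", "5324", "null", "12", "5,37,12",
    "null,null,4,8", "4,8,null,null", "null,null,4,8",
    "4,8,1,3", "1356",
]
mask = keep_mask(values)
kept = [v for v, k in zip(values, mask) if k]
```

With a DataFrame, the same mask could be applied as `df = df.loc[keep_mask(df['value'].tolist())]`, which leaves the strings in `value` untouched.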
When I use the code from one of the answers:

```python
import numpy as np
df['value'] = df['value'].apply(lambda x : ','.join(np.sort(x.split(','))))
df.drop_duplicates(['value'], keep='first')
```

I get:
| event | value | time |
|---|---|---|
| seed | 57 | 2021-08-01 09:49:23 |
| ghy | 869 | 2021-08-02 09:50:12 |
| repo | 5324 | 2021-09-03 10:49:23 |
| repo | null | 2021-09-03 11:49:23 |
| harv | 12 | 2021-09-05 09:43:23 |
| weig | 12,37,5 | 2021-09-06 09:25:12 |
| repo | 4,8,null,null | 2021-09-07 09:12:23 |
| repo | 4,8,1,3 | 2021-09-09 12:12:23 |
| repo | 1356 | 2021-09-10 12:49:23 |
Some of the values in the 'value' column change their order: `5,37,12` becomes `12,37,5` and `null,null,4,8` becomes `4,8,null,null`.
Do you have an idea how to fix it?
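
The reordering comes from overwriting `value` with the sorted tokens. The tokens are strings, so they are sorted lexicographically, which is why `5` ends up after `12` and `37`. A possible workaround, sketched here in plain Python, is to compute the sorted form only as a comparison key and leave the original strings alone. Note that this is a global, order-insensitive de-duplication, so it is not exactly the "immediately after" rule, although it gives the same rows for the sample data. The name `first_occurrences` is hypothetical.

```python
def first_occurrences(values):
    """Mark the first row for each multiset of tokens, ignoring token order.

    The sorted tuple is used only as a lookup key; the input strings
    themselves are never modified.
    """
    seen = set()
    keep = []
    for value in values:
        key = tuple(sorted(value.split(",")))
        keep.append(key not in seen)
        seen.add(key)
    return keep


values = [
    "57", "869", "5324", "null", "12", "5,37,12",
    "null,null,4,8", "4,8,null,null", "null,null,4,8",
    "4,8,1,3", "1356",
]
mask = first_occurrences(values)
kept = [v for v, k in zip(values, mask) if k]
```

Applied to a DataFrame, this would be `df.loc[first_occurrences(df['value'].tolist())]`, and `5,37,12` and `null,null,4,8` keep their original order.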
Do `null,null,4,8`, `4,8,null,null` and `null,null,4,8` always have the same numbers, here `4,8`? If the rows are `null,null,4,8`, `1,0,null,null` and `4,8,null,null`, is only the last `4,8,null,null` removed?

Is the `value` column a list or a string?

`value` is a string.