1

I am dealing with a pandas data frame as shown below.

 id x1 y1 0 2 some_val some_val 1 2 some_val some_val 2 2 some_val some_val 3 2 some_val some_val 4 2 some_val some_val 5 0 0 0 6 3 some_val some_val 7 3 some_val some_val 8 0 0 0 9 5 some_val some_val 10 5 some_val some_val 11 5 some_val some_val 12 0 0 0 13 6 some_val some_val 14 6 some_val some_val 15 6 some_val some_val 16 6 some_val some_val 

My original data frame was the data frame without the rows with all '0' values. As per the project requirement I had to insert the rows with all 0's value whenever the "id" changes.

Now I want to delete all the rows of any "id" which has 3 and less than 3 rows. From the above data frame, I would want to delete all the respective rows of id- "3" and "5" . My resultant data frame should look like below:

 id x1 y1 0 2 some_val some_val 1 2 some_val some_val 2 2 some_val some_val 3 2 some_val some_val 4 2 some_val some_val 5 0 0 0 6 6 some_val some_val 7 6 some_val some_val 8 6 some_val some_val 9 6 some_val some_val 

Kindly suggest me a way to obtain this result.

1
  • In your requirements you specify "has 3 and less than 3 rows"... what does this mean? Commented Mar 23, 2017 at 19:59

3 Answers 3

2

The simplest answer is to remove the zero rows because they may get in the way of the calculation if you have more than 3 of them. then do a group by. then filter. then add back zeros like you did in other question/answer

d1 = df.query('ProjID != 0').groupby('ProjID').filter(lambda df: len(df) > 3) d1 ProjID Xcoord Ycoord 0 2 -7.863509 5.221327 1 2 some_val some_val 2 2 some_val some_val 3 2 some_val some_val 4 2 some_val some_val 13 6 some_val some_val 14 6 some_val some_val 15 6 some_val some_val 16 6 some_val some_val 

Then add back

pidv = d1.ProjID.values pid_chg = np.append(pidv[:-1] != pidv[1:], True) i = d1.index.repeat(pid_chg + 1) d2 = d1.loc[i, :].copy() d2.loc[i.duplicated()] = 0 d2.reset_index(drop=True) ProjID Xcoord Ycoord 0 2 -7.863509 5.221327 1 2 some_val some_val 2 2 some_val some_val 3 2 some_val some_val 4 2 some_val some_val 5 0 0 0 6 6 some_val some_val 7 6 some_val some_val 8 6 some_val some_val 9 6 some_val some_val 10 0 0 0 
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you so much !! You are a life saver.
1

Say your DataFrame name is df, you need to do the following:

df = df[df['col'<>=condition]] 

Specifically to your case:

df = df[df['ProjID'!=3]] 

Same with 5. You can combine both filters with an 'and' for efficiency.

This is called DataFrame indexing filters.

5 Comments

Regarding your other question, I am a bit confused by it: do you want to delete all 3=> rows as well, or you just want to assign 0 to them?
Thank You. I have tried this, but the problem is the data frame is very huge, with thousands of rows, I cant manually search for the specific ProjID with <=3 rows and delete them. Can you suggest a way where it iterates over the whole df ?
I have to assign a row with 0 values, whenever the ProjID changes. I think, I should do this part once I have dropped the unwanted rows.
The method I proposed does exactly what you want. The "!=3" is a condition, is not an index . This means that when applied, it will loop through the DataFrame and deletes (actually reassigns) any row that has 'ProjID' equal to 3. For example, df = df[df['ProjID'>3]] Will filter out all the 'ProjID' rows that are equal or less than 3.
Regarding your other question, it is better to re-assign zero at the same time. Well, to be more specific, you want to pick up all the rows that satisfy your condition (in this case df[df['ProjID'<=3]] ) , and then replace their values with zeros. This is best done with a lambda function and 'apply': df = df[df.ProjID.apply(lambda x: x if x>3 else 0)] Here is a relevant discussion: stackoverflow.com/questions/11869910/…
0

You can use groupby and filter the IDs with count less than three and use the resulting list to index the df.

filtered = df.groupby('ProjID').Xcoord.filter(lambda x: x.count() > 3) df.iloc[filtered.index.tolist()] ProjID Xcoord Ycoord 0 2 -7.863509 5.221327 1 2 some_val some_val 2 2 some_val some_val 3 2 some_val some_val 4 2 some_val some_val 13 6 some_val some_val 14 6 some_val some_val 15 6 some_val some_val 16 6 some_val some_val 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.