  • Yes, `new_df = new_df.append(temp)` is very inefficient. It makes your algorithm quadratic in time, because `pandas.DataFrame.append` always creates a whole new DataFrame. The most efficient way would probably be to make the 'customer id' column the index and simply select with your list. Commented Jul 11, 2020 at 23:40
  • I simulated this with 3 million records and 100,000 customer IDs. It takes only a few seconds with `isin`. Commented Jul 12, 2020 at 0:23
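
The two suggestions in these comments could look roughly like the sketch below. The column name 'customer id' is taken from the comments; the sample data and the helper names `filter_by_isin` and `filter_by_index` are made up for illustration and are not from the original question.

```python
import pandas as pd

def filter_by_isin(df, ids, column="customer id"):
    """Keep rows whose `column` value appears in `ids`, using one boolean mask."""
    return df[df[column].isin(ids)]

def filter_by_index(df, ids, column="customer id"):
    """Make `column` the index, then select the wanted ids from it."""
    indexed = df.set_index(column)
    # Index.isin is used instead of .loc[ids] so that ids missing
    # from the data are skipped rather than raising a KeyError.
    return indexed.loc[indexed.index.isin(ids)]

# Small made-up sample; the real data would have millions of rows.
df = pd.DataFrame({
    "customer id": [1, 2, 3, 2, 4, 1],
    "amount": [10, 20, 30, 40, 50, 60],
})
wanted = [1, 2]

by_isin = filter_by_isin(df, wanted)
by_index = filter_by_index(df, wanted)
print(by_isin)
print(by_index)
```

Both versions select all matching rows in a single pass instead of growing a DataFrame inside a loop, which is what avoids the quadratic cost described in the first comment.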