@mroeschke
Member

xref #57534 (comment)

is_range_indexer lets us short-circuit, unlike checking the remainder, so it is evaluated first.
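A minimal sketch of the idea, not the actual pandas code: the helper names below are hypothetical, and the real is_range_indexer is a compiled routine whose details may differ. It assumes the check divides the offsets by the first step and then tests two conditions. The loop-based check can stop at the first mismatch, while the remainder check has to scan the whole array, so the cheaper check is placed first in the `and`.

```python
import numpy as np


def is_range_indexer_sketch(values: np.ndarray, n: int) -> bool:
    """Hypothetical stand-in: True only if values equals [0, 1, ..., n - 1]."""
    if len(values) != n:
        return False
    for i in range(n):
        if values[i] != i:
            return False  # early exit at the first mismatch
    return True


def looks_like_range(values: np.ndarray) -> bool:
    """Hypothetical helper: can `values` be expressed as an evenly spaced range?"""
    if values.dtype.kind != "i" or values.ndim != 1 or len(values) < 2:
        return False
    diff = values[1] - values[0]
    if diff == 0:
        return False
    quotient, remainder = np.divmod(values - values[0], diff)
    # The short-circuiting check runs first; the full-array remainder
    # scan only happens when the quotients already form 0..n-1.
    return is_range_indexer_sketch(quotient, len(quotient)) and not remainder.any()
```

With this ordering, an input such as [0, 5, 1, 2] is rejected as soon as the third quotient fails to match its position, without ever scanning the remainders.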

In [1]: from pandas import *; import numpy as np
   ...: np.random.seed(123)
   ...: size = 1_000_000
   ...: ngroups = 1000
   ...: data = Series(np.random.randint(0, ngroups, size=size))
+ /opt/miniconda3/envs/pandas-dev/bin/ninja
[1/1] Generating write_version_file with a custom command

In [2]: %timeit data.groupby(data).groups
16.8 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # PR

In [3]: %timeit data.groupby(data).groups
17.6 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # main

@mroeschke mroeschke added the Performance (Memory or execution speed performance) and Index (Related to the Index class or subclasses) labels Feb 21, 2024
@mroeschke mroeschke added this to the 3.0 milestone Feb 21, 2024
@mroeschke mroeschke requested a review from phofl February 21, 2024 23:28
@phofl phofl merged commit 1debaf3 into pandas-dev:main Feb 22, 2024
@phofl
Member

phofl commented Feb 22, 2024

@mroeschke mroeschke deleted the perf/rng/shallow_copy2 branch February 22, 2024 16:35
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024