@lukemanley
Member

Improves performance when merging on sorted keys. There are also some improvements for non-sorted keys, which come from allowing the left/right indexers to be None.
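To illustrate the idea of a `None` indexer, here is a minimal pure-Python sketch, not the actual pandas implementation. It assumes both key lists are strictly increasing (sorted and unique); the function names `inner_join_sorted_unique` and `take` are hypothetical and exist only for this example. An indexer of `None` is taken to mean "every row of this side is kept, in its original order", so the caller can skip reindexing that side.

```python
def inner_join_sorted_unique(left_keys, right_keys):
    """Inner join of two strictly increasing key lists in one linear pass.

    Returns (joined_keys, left_indexer, right_indexer). An indexer is None
    when every row of that side is matched, so no reindexing is required.
    Illustrative only: this is not how pandas implements merge.
    """
    i = j = 0
    joined, left_idx, right_idx = [], [], []
    while i < len(left_keys) and j < len(right_keys):
        if left_keys[i] < right_keys[j]:
            i += 1  # left key has no partner yet, advance left
        elif left_keys[i] > right_keys[j]:
            j += 1  # right key has no partner yet, advance right
        else:
            joined.append(left_keys[i])
            left_idx.append(i)
            right_idx.append(j)
            i += 1
            j += 1
    # With unique sorted keys, "all rows matched" implies the indexer is
    # simply 0..n-1, so it can be replaced by None.
    left_indexer = None if len(left_idx) == len(left_keys) else left_idx
    right_indexer = None if len(right_idx) == len(right_keys) else right_idx
    return joined, left_indexer, right_indexer


def take(values, indexer):
    """Select rows by position, returning the input untouched for None."""
    if indexer is None:
        return values  # fast path: no copy, no reordering
    return [values[k] for k in indexer]


left = list(range(0, 20, 2))   # 0, 2, ..., 18
right = list(range(10, 14))    # 10, 11, 12, 13
keys, lidx, ridx = inner_join_sorted_unique(left, right)

same_keys, same_l, same_r = inner_join_sorted_unique([1, 2, 3], [1, 2, 3])
```

In the second call both sides match completely, so both indexers come back as `None` and `take` returns each column as-is. Whether pandas reaches its fast path in the same situations is not something this sketch claims.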

Scaled-down version of the example in #56115:

    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame({"key": np.arange(0, 1_000_000, 2), "val1": 1})
    df2 = pd.DataFrame({"key": np.arange(500_000, 700_000, 1), "val2": 2})

    %timeit df = pd.merge_ordered(df1, df2, on="key", how="inner")
    # 389 ms ± 14.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)     <- main
    # 10 ms ± 319 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)    <- PR

    %timeit df = pd.merge(df1, df2, on="key", how="inner")
    # 173 ms ± 2.15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)   <- main
    # 10.7 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) <- PR

ASVs:

    > asv continuous -f 1.1 upstream/main merge-monotonic -b ^join_merge

| Change | Before [d77d5e54] <main> | After [765c4026] <merge-monotonic> | Ratio | Benchmark (Parameter) |
|--------|--------------------------|------------------------------------|-------|-----------------------|
| - | 22.3±1ms | 19.9±0.7ms | 0.9 | join_merge.MergeDatetime.time_merge(('ms', 'ms'), None, False) |
| - | 22.6±0.9ms | 19.0±0.5ms | 0.84 | join_merge.MergeDatetime.time_merge(('ms', 'ms'), 'Europe/Brussels', False) |
| - | 20.8±0.9ms | 17.4±0.3ms | 0.84 | join_merge.MergeDatetime.time_merge(('ns', 'ms'), 'Europe/Brussels', False) |
| - | 19.6±2ms | 16.2±0.1ms | 0.83 | join_merge.ConcatIndexDtype.time_concat_series('string[python]', 'monotonic', 1, True) |
| - | 3.28±0.3ms | 2.71±0.06ms | 0.83 | join_merge.Merge.time_merge_dataframe_integer_key(False) |
| - | 20.1±0.7ms | 16.7±0.7ms | 0.83 | join_merge.MergeDatetime.time_merge(('ns', 'ns'), None, False) |
| - | 5.87±0.3ms | 4.81±0.1ms | 0.82 | join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'non_monotonic', 1, False) |
| - | 14.6±0.1ms | 11.9±1ms | 0.82 | join_merge.MergeEA.time_merge('Float32', False) |
| - | 14.7±0.4ms | 12.0±0.4ms | 0.82 | join_merge.MergeEA.time_merge('Int64', False) |
| - | 22.0±1ms | 17.7±0.7ms | 0.8 | join_merge.MergeDatetime.time_merge(('ns', 'ms'), None, False) |
| - | 13.9±0.2ms | 10.1±0.7ms | 0.73 | join_merge.MergeEA.time_merge('UInt16', False) |
| - | 16.1±0.3ms | 11.7±0.08ms | 0.73 | join_merge.MergeEA.time_merge('UInt64', False) |
| - | 13.5±0.05ms | 9.48±0.09ms | 0.7 | join_merge.MergeEA.time_merge('Int32', False) |
| - | 15.0±0.2ms | 10.4±0.2ms | 0.7 | join_merge.MergeEA.time_merge('UInt32', False) |
| - | 16.0±0.3ms | 10.7±0.5ms | 0.67 | join_merge.MergeDatetime.time_merge(('ns', 'ms'), 'Europe/Brussels', True) |
| - | 14.1±0.4ms | 9.46±0.04ms | 0.67 | join_merge.MergeEA.time_merge('Int16', False) |
| - | 13.4±0.3ms | 8.67±0.3ms | 0.65 | join_merge.MergeEA.time_merge('Float64', True) |
| - | 13.3±0.2ms | 8.70±0.2ms | 0.65 | join_merge.MergeEA.time_merge('UInt64', True) |
| - | 15.6±0.2ms | 10.0±0.3ms | 0.64 | join_merge.MergeDatetime.time_merge(('ms', 'ms'), None, True) |
| - | 12.6±0.09ms | 7.98±0.6ms | 0.63 | join_merge.MergeEA.time_merge('Int64', True) |
| - | 15.7±0.4ms | 9.72±0.1ms | 0.62 | join_merge.MergeDatetime.time_merge(('ms', 'ms'), 'Europe/Brussels', True) |
| - | 17.5±2ms | 10.8±0.3ms | 0.61 | join_merge.MergeDatetime.time_merge(('ns', 'ms'), None, True) |
| - | 15.3±0.4ms | 9.35±0.2ms | 0.61 | join_merge.MergeDatetime.time_merge(('ns', 'ns'), None, True) |
| - | 12.7±0.4ms | 7.74±0.6ms | 0.61 | join_merge.MergeEA.time_merge('UInt32', True) |
| - | 16.0±0.8ms | 9.25±0.1ms | 0.58 | join_merge.MergeDatetime.time_merge(('ns', 'ns'), 'Europe/Brussels', True) |
| - | 12.1±0.3ms | 7.07±0.5ms | 0.58 | join_merge.MergeEA.time_merge('UInt16', True) |
| - | 13.1±2ms | 7.25±0.7ms | 0.55 | join_merge.MergeEA.time_merge('Float32', True) |
| - | 11.7±0.3ms | 6.41±0.06ms | 0.55 | join_merge.MergeEA.time_merge('Int16', True) |
| - | 12.0±0.5ms | 6.44±0.03ms | 0.54 | join_merge.MergeEA.time_merge('Int32', True) |
| - | 119±5ms | 52.4±0.1ms | 0.44 | join_merge.MergeOrdered.time_merge_ordered |

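
For readers unfamiliar with the Ratio column, the short sketch below recomputes it as "after divided by before" for three rows copied from the ASV output, so values below 1 mean the PR branch is faster. The helper names `ratio` and `is_flagged` are hypothetical. The reading of `-f 1.1` as "report only changes beyond a factor of 1.1 in either direction" is an assumption about asv's behavior, and asv works from unrounded timings, so a recomputation from the displayed values may differ in the last digit.

```python
def ratio(before_ms, after_ms):
    """Ratio as shown in the table: after / before (below 1 is faster)."""
    return after_ms / before_ms


def is_flagged(r, factor=1.1):
    """Assumed meaning of `-f 1.1`: flag changes beyond the given factor."""
    return r > factor or r < 1 / factor


# (before, after, reported ratio) copied from rows of the table above.
rows = [
    (119.0, 52.4, 0.44),  # MergeOrdered.time_merge_ordered
    (11.7, 6.41, 0.55),   # MergeEA.time_merge('Int16', True)
    (13.5, 9.48, 0.70),   # MergeEA.time_merge('Int32', False)
]

recomputed = [round(ratio(before, after), 2) for before, after, _ in rows]
speedups = [round(before / after, 2) for before, after, _ in rows]

for (before, after, reported), r in zip(rows, recomputed):
    print(f"{before}ms -> {after}ms: ratio {r} (reported {reported})")
```

The inverse of the ratio gives the speedup factor, so the `merge_ordered` benchmark runs a little over twice as fast on the PR branch.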
@lukemanley added the Performance (memory or execution speed performance) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels on Dec 16, 2023
@lukemanley added this to the 2.2 milestone on Dec 16, 2023
Member

@phofl left a comment

PR looks good, but it looks like this broke some code examples in the merge user guide.

@phofl merged commit 061c2e9 into pandas-dev:main on Dec 17, 2023
@phofl
Member

phofl commented Dec 17, 2023

thx!

I am a very big fan of this
