Skip to content

Conversation

@KevsterAmp
Copy link
Contributor

Tests on pandas.merge

(pandas-dev) kev@mac pandas % pytest pandas/tests/reshape/merge/test_merge.py + /opt/homebrew/Caskroom/miniforge/base/envs/pandas-dev/bin/ninja [1/1] Generating write_version_file with a custom command ============================================================================================================ test session starts ============================================================================================================= platform darwin -- Python 3.10.14, pytest-8.3.2, pluggy-1.5.0 PyQt5 5.15.9 -- Qt runtime 5.15.8 -- Qt compiled 5.15.8 rootdir: /Users/kev/self/pandas configfile: pyproject.toml plugins: localserver-0.0.0, qt-4.4.0, cov-5.0.0, anyio-4.4.0, hypothesis-6.108.7, cython-0.3.1, xdist-3.6.1 collected 939 items pandas/tests/reshape/merge/test_merge.py ........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ------------------------------------------------------------------------------------------ generated xml file: /Users/kev/self/pandas/test-data.xml ------------------------------------------------------------------------------------------ ============================================================================================================ slowest 30 durations ============================================================================================================ 0.02s call pandas/tests/reshape/merge/test_merge.py::TestMergeDtypes::test_merge_ea_with_string[inner-str0] 0.02s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_non_unique_indexes 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_validation 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_indicator_result_integrity 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_indicator 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_non_unique_index_many_to_many 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_left_empty_right_notempty (23 durations < 0.005s hidden. Use -vv to show these durations.) ============================================================================================================ 939 passed in 1.37s ============================================================================================================= 

Error Reproducible Example

from datetime import datetime data_A = list() data_B = list() for i in range(10): data_A.append({ "id": i, "created_date": datetime.today(), "created_at": datetime.now(), }) data_B.append({ "id": i if i % 2 == 0 else 3*i, "created_date": datetime.today(), "created_at": datetime.now(), }) df_A = pd.DataFrame.from_dict(data_A) df_B = pd.DataFrame.from_dict(data_B) df_A.merge(df_B, how="full", on="id")
Traceback (most recent call last): File "/Users/kev/self/pandas/test.py", line 29, in <module> df_A.merge(df_B, how="full", on="id") File "/Users/kev/self/pandas/pandas/core/frame.py", line 10807, in merge return merge( File "/Users/kev/self/pandas/pandas/core/reshape/merge.py", line 355, in merge raise ValueError(f"'{how}' is not a valid Merge type ({merge_type})") ValueError: 'full' is not a valid Merge type (['left', 'right', 'inner', 'outer', 'cross']) 

Not sure if it should raise ValueError or MergeError

left._check_copy_deprecation(copy)
right_df = _validate_operand(right)
if how == "cross":
merge_type = ["left", "right", "inner", "outer", "cross"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be in _MergeOperation.

And a test is needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke - moved it to _MergeOperation and added asof as valid Merge Type.


Added test function to validate it as well. Thanks

@mroeschke mroeschke added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Error Reporting Incorrect or improved errors from pandas labels Aug 7, 2024
@KevsterAmp
Copy link
Contributor Author

@mroeschke - Added tests and added asof as valid merge type, running pytest pandas/tests/reshape/merge passes all tests.

(pandas-dev) kev@mac pandas % pytest pandas/tests/reshape/merge + /opt/homebrew/Caskroom/miniforge/base/envs/pandas-dev/bin/ninja [4/4] Linking target pandas/_libs/tslibs/offsets.cpython-310-darwin.so ============================================================================================================ test session starts ============================================================================================================= platform darwin -- Python 3.10.14, pytest-8.3.2, pluggy-1.5.0 PyQt5 5.15.9 -- Qt runtime 5.15.8 -- Qt compiled 5.15.8 rootdir: /Users/kev/self/pandas configfile: pyproject.toml plugins: localserver-0.0.0, qt-4.4.0, cov-5.0.0, anyio-4.4.0, hypothesis-6.108.7, cython-0.3.1, xdist-3.6.1 collected 1332 items pandas/tests/reshape/merge/test_join.py .......................................................................... pandas/tests/reshape/merge/test_merge.py ................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. pandas/tests/reshape/merge/test_merge_asof.py .s...................................................................................................s.....................s............................. pandas/tests/reshape/merge/test_merge_cross.py ................. pandas/tests/reshape/merge/test_merge_index_as_string.py ................................................................................ pandas/tests/reshape/merge/test_merge_ordered.py ..................... pandas/tests/reshape/merge/test_multi.py .......................................... ------------------------------------------------------------------------------------------ generated xml file: /Users/kev/self/pandas/test-data.xml ------------------------------------------------------------------------------------------ ============================================================================================================ slowest 30 durations ============================================================================================================ 0.02s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_full_outer_join 0.02s call pandas/tests/reshape/merge/test_join.py::test_join_multiindex_not_alphabetical_categorical[categories0-values0] 0.02s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_non_unique_indexes 0.02s call pandas/tests/reshape/merge/test_multi.py::TestMergeMulti::test_compress_group_combinations 0.02s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_right_outer_join 0.01s call pandas/tests/reshape/merge/test_multi.py::TestMergeMulti::test_left_join_multi_index[True-True] 0.01s call pandas/tests/reshape/merge/test_multi.py::TestMergeMulti::test_left_join_multi_index[False-True] 0.01s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_left_outer_join 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_validation 0.01s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_join_many_non_unique_index 0.01s setup pandas/tests/reshape/merge/test_merge_index_as_string.py::test_merge_indexes_and_columns_lefton_righton[left_df2-right_df1-left_on3-right_on3-outer] 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_indicator_result_integrity 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_indicator 0.01s call pandas/tests/reshape/merge/test_multi.py::TestMergeMulti::test_left_join_multi_index[True-False] 0.01s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_inner_join 0.01s setup pandas/tests/reshape/merge/test_merge_index_as_string.py::test_merge_indexes_and_columns_lefton_righton[left_df2-right_df2-left_on1-right_on1-right] 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_non_unique_index_many_to_many 0.01s call pandas/tests/reshape/merge/test_merge.py::TestMerge::test_merge_left_empty_right_notempty 0.01s call pandas/tests/reshape/merge/test_multi.py::TestMergeMulti::test_left_join_multi_index[False-False] 0.01s call pandas/tests/reshape/merge/test_join.py::TestJoin::test_join_multiindex (10 durations < 0.005s hidden. Use -vv to show these durations.) ====================================================================================================== 1329 passed, 3 skipped in 2.28s ======================================================================================================= 
@KevsterAmp KevsterAmp requested a review from mroeschke August 12, 2024 01:53
@mroeschke
Copy link
Member

@KevsterAmp KevsterAmp force-pushed the add-pandas-merge-how-param-validation branch from 7f8b512 to 86b4055 Compare August 21, 2024 12:35
@KevsterAmp KevsterAmp requested a review from mroeschke August 21, 2024 22:55
@mroeschke
Copy link
Member

pre-commit.ci autofix

@mroeschke mroeschke added this to the 3.0 milestone Aug 22, 2024
@mroeschke mroeschke merged commit 2748365 into pandas-dev:main Aug 22, 2024
@mroeschke
Copy link
Member

Thanks @KevsterAmp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Error Reporting Incorrect or improved errors from pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode

2 participants