@TomAugspurger
Contributor

Closes #16519

@TomAugspurger added the Algos, Blocker, and Needs Backport labels May 30, 2017
@TomAugspurger added this to the 0.20.2 milestone May 30, 2017
Contributor

@jreback jreback left a comment

lgtm.

- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`)
- Bug in ``DataFrame.update()`` with ``overwrite=False`` and ``NaN values`` (:issue:`15593`)

- Bug in :func:`pd.unique` on an array of tuples (:issue:`16519`)
Contributor

I think this has to be :func:`unique`?

Contributor Author

You're correct.

np.array([False, False, False]))

@pytest.mark.parametrize('arr, unique', [
([(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)],
Contributor

also add an example of this in pd.unique itself.
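
For reference, a minimal sketch of the behavior under discussion: pd.unique applied to a 1-D object array of tuples, where each tuple should count as a single value. The helper to_object_array is hypothetical (it is not a pandas function); it fills the array element by element so that NumPy does not unpack the tuples into a 2-D array. The input mirrors the parametrized test case quoted above.

```python
import numpy as np
import pandas as pd


def to_object_array(items):
    """Hypothetical helper: build a 1-D object array holding each item as-is."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # element-wise assignment keeps each tuple intact
    return out


arr = to_object_array([(0, 0), (0, 1), (1, 0), (1, 1),
                       (0, 0), (0, 1), (1, 0), (1, 1)])
result = pd.unique(arr)  # uniques are returned in order of first appearance
print(list(result))
```

On a pandas version that includes this fix, the result should be a 1-D object array containing the four distinct tuples.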

@jreback
Contributor

jreback commented May 30, 2017

I agree with @chris-b1's comment; yes, that looks right.

@codecov

codecov bot commented May 30, 2017

Codecov Report

Merging #16543 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #16543   +/-   ##
=======================================
  Coverage   90.79%   90.79%
=======================================
  Files         161      161
  Lines       51063    51063
=======================================
  Hits        46365    46365
  Misses       4698     4698
Flag Coverage Δ
#multiple 88.63% <100%> (ø) ⬆️
#single 40.15% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.41% <100%> (ø) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...35da5b9. Read the comment docs.

@codecov

codecov bot commented May 30, 2017

Codecov Report

Merging #16543 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #16543   +/-   ##
=======================================
  Coverage   90.75%   90.75%
=======================================
  Files         161      161
  Lines       51074    51074
=======================================
  Hits        46353    46353
  Misses       4721     4721
Flag Coverage Δ
#multiple 88.59% <100%> (ø) ⬆️
#single 40.16% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.41% <100%> (ø) ⬆️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03d44f3...eb7c18f. Read the comment docs.

@TomAugspurger
Contributor Author

I think you could back out this change from #16434

The regression test from #16434 fails if I revert the change. The difference being

(Pdb++) values  # this is with the fix reverted
array([[1, 'a']], dtype=object)
(Pdb++) lib.list_to_object_array(list([(1, 'a')]))  # this is the fix from 16434
array([(1, 'a')], dtype=object)

So an array of lists vs. an array of tuples. Is that correct?
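
The difference can be reproduced with plain NumPy. This is only a sketch of the shape distinction, not the pandas internals: it does not call lib.list_to_object_array, and the variable names are illustrative.

```python
import numpy as np

# Letting NumPy infer the shape unpacks the tuple into a 2-D array
# (one row, two columns), so the tuple is no longer a single element.
inferred = np.array([(1, 'a')], dtype=object)
print(inferred.shape)  # (1, 2)

# Filling a preallocated 1-D object array keeps the tuple as one element.
preserved = np.empty(1, dtype=object)
preserved[0] = (1, 'a')
print(preserved.shape)  # (1,)
print(type(preserved[0]))
```

Only the second form keeps (1, 'a') as a single tuple element, which matches the 1-D output shown for the #16434 fix above.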

@jreback
Contributor

jreback commented May 31, 2017

You may just need to call _ensure_arraylike there instead of the isinstance check.

@TomAugspurger
Contributor Author

TomAugspurger commented May 31, 2017

Using _ensure_arraylike failed on an empty array, e.g. pd.Series([1, 2]).isin([]), since the empty array gets a float dtype instead of object dtype, so the hashing fails later on. I can handle that case if you want, or just leave the isinstance checks. Not sure which is cleaner really.
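
A small sketch of the empty-input case, using only public NumPy and pandas calls. It shows the dtype an empty list is given and the expected user-facing result, not the internal code path through _ensure_arraylike.

```python
import numpy as np
import pandas as pd

# An empty list carries no type information, so NumPy falls back to float64.
empty = np.asarray([])
print(empty.dtype)  # float64

# The user-facing call from the comment above: nothing can be a member of
# an empty collection, so every entry of the mask should be False.
mask = pd.Series([1, 2]).isin([])
print(mask.tolist())
```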

@jreback
Contributor

jreback commented May 31, 2017

@TomAugspurger I added a commit; it should fix this up, I think.

@jreback
Contributor

jreback commented May 31, 2017

@TomAugspurger looks like this broke it. OK, just revert my commit and merge your changes. This is a very touchy area. I think we are doing the right things in the tests, but it's just tricky to get exactly right.

@jreback
Contributor

jreback commented May 31, 2017

@TomAugspurger rebased to remove my commit.

@TomAugspurger
Contributor Author

@jreback are you able to restart appveyor jobs? One of the network tests failed on the first job.

@jreback merged commit 9d7afa7 into pandas-dev:master Jun 1, 2017
@jreback
Contributor

jreback commented Jun 1, 2017

thanks!

TomAugspurger added a commit to TomAugspurger/pandas that referenced this pull request Jun 1, 2017
TomAugspurger added a commit that referenced this pull request Jun 4, 2017
@TomAugspurger deleted the unique-tuples branch June 4, 2017 20:29
Kiv pushed a commit to Kiv/pandas that referenced this pull request Jun 11, 2017
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
Labels

Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff)
Blocker (blocking issue or pull request for an upcoming release)

3 participants