-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
Adding Multiindex support to dataframe pivot function(Fixes #21425) #21467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@ ## master #21467 +/- ## ========================================== + Coverage 91.9% 91.9% +<.01% ========================================== Files 154 154 Lines 49657 49667 +10 ========================================== + Hits 45638 45648 +10 Misses 4019 4019
Continue to review full report at Codecov.
|
| @NikhilKumarM : Off to a good start here! Going to need a |
| Thanks @gfyoung . I have added test cases and whatsnew entry |
| @NikhilKumarM : You actually need to add the |
| @NikhilKumarM Did you push your new changes? Because they do not show up on github here. Normally you only need to add new commits to the same branch, push that to github, and the PR here should get updated. |
| @jorisvandenbossche : If you look at the PR description, you can see that @NikhilKumarM accidentally added them there. If this PR goes unresponsive, I can bring it to the finish line. |
| Hello @NikhilKumarM! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 01, 2018 at 13:52 Hours UTC |
gfyoung left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! LGTM!
jorisvandenbossche left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can you also add the ability to pass multiple values for
columns? - Can you also update the docstring of
DataFrame.pivot?
doc/source/whatsnew/v0.23.2.txt Outdated
| **Reshaping** | ||
| | ||
| - | ||
| - Bug in: DataFrame.pivot() function where error was raised when multiple columns are set as index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to the 0.24.0.txt file? And you can mention it in the Enhancements section
(as this is an new feature, not really a bug fix, I think, as it was just not yet implemented)
pandas/core/reshape/reshape.py Outdated
| cols = [columns] if index is None else [index, columns] | ||
| append = index is None | ||
| indexed = self.set_index(cols, append=append) | ||
| # adding the support for multi-index in pivot function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is not really needed I think
pandas/tests/reshape/test_pivot.py Outdated
| expected = concat([means, stds], keys=['mean', 'std'], axis=1) | ||
| tm.assert_frame_equal(result, expected) | ||
| | ||
| # adding the test case for multiple columns as index (#21425) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this comment to the first line of the test function
pandas/tests/reshape/test_pivot.py Outdated
| expected = DataFrame([[0, 1], [2, 3], [4, 5], [6, 7]], | ||
| exp_index, | ||
| exp_columns) | ||
| tm.assert_frame_equal(result, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an example where the values argument is not mentioned? eg df.pivot(index=['lev1', 'lev2'], columns='lev3')
(I think this will not yet work, but should work)
|
| @NikhilKumarM there is a section in the contributing guide focused on docstrings: https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html |
pandas/core/reshape/reshape.py Outdated
| index = MultiIndex.from_arrays([index, self[columns]]) | ||
| # added this case to handle multi-index | ||
| elif isinstance(index, list): | ||
| indexes = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this a list-comprehension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure!
pandas/core/reshape/reshape.py Outdated
| if index is None: | ||
| index = self.index | ||
| index = MultiIndex.from_arrays([index, self[columns]]) | ||
| # added this case to handle multi-index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make a comment that reflects what is here (not added)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
doc/source/whatsnew/v0.24.0.txt Outdated
| - :func:`Series.mode` and :func:`DataFrame.mode` now support the ``dropna`` parameter which can be used to specify whether NaN/NaT values should be considered (:issue:`17534`) | ||
| - :func:`to_csv` now supports ``compression`` keyword when a file handle is passed. (:issue:`21227`) | ||
| - :meth:`Index.droplevel` is now implemented also for flat indexes, for compatibility with MultiIndex (:issue:`21115`) | ||
| - :func:'Dataframe.pivot' now supports multiple columns as index. (:issue:'21425') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'columns as an index'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will make those changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
| Why is continuous-integration/travis-ci failing? |
@NikhilKumarM : Restarted Travis for you. Failure unrelated to PR. |
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small change otherwise lgtm. does the doc string need updating?
pandas/core/reshape/reshape.py Outdated
| if index is None: | ||
| index = self.index | ||
| index = MultiIndex.from_arrays([index, self[columns]]) | ||
| elif isinstance(index, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use is_list_like instead here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will do it.
| @jreback @gfyoung @jorisvandenbossche |
| @NikhilKumarM : Got a linting error on Travis: https://travis-ci.org/pandas-dev/pandas/jobs/398632775#L2904 |
| @gfyoung Fixed the linting error. |
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a whatsnew note for 0.24. I guess this is a bug fix, though was this documented before? if so its an enhancement. Can you also add an exampl ein reshape.rst with this. And in the pivot doc-string as well.
| if values is None: | ||
| cols = [columns] if index is None else [index, columns] | ||
| if index is None: | ||
| cols = [columns] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have tests for each one of these paths here? e.g. branching a lot more now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is an enhancement. I have added the test cases and need to update the docs. Thanks for your time.
| also needs to rebase |
| can you merge master |
| Sure! |
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you merge master & add a whatsnew note.
| cols = [columns] if index is None else [index, columns] | ||
| if index is None: | ||
| cols = [columns] | ||
| else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment explaining what is going on here, ideally we would have a comment for each branch of the if/else
| can you update |
| can you merge master |
| I think pivot function has been removed from reshape.py and put in pivot.py. I will rebase with the master and make necessary changes by this weekend. Thank you !! |
| can you merge master and update |
| Closing as stale - ping if you'd like to reopen and continue working this |
git diff upstream/master -u -- "*.py" | flake8 --diffAdded extra case to handle indexing on multiple colums in dataframe.pivot function