Categorical fixups #7768
Changes from all commits: c2f490e, 90a81df, f4bf9ee, 130b61e, 65d9d6e, 5c4f1bd, 0438a30, 19f4d46, 47953a2, 2958ce1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -90,6 +90,7 @@ By using some special functions: | |
| df['group'] = pd.cut(df.value, range(0, 105, 10), right=False, labels=labels) | ||
| df.head(10) | ||
| | ||
| See :ref:`documentation <reshaping.tile.cut>` for :func:`~pandas.cut`. | ||
| | ||
| `Categoricals` have a specific ``category`` :ref:`dtype <basics.dtypes>`: | ||
| | ||
| | @@ -331,6 +332,45 @@ Operations | |
| | ||
| The following operations are possible with categorical data: | ||
| | ||
| Comparing `Categoricals` with other objects is possible in two cases: | ||
| * comparing a `Categorical` to another `Categorical`, when `levels` and `ordered` are the same, or | ||
| * comparing a `Categorical` to a scalar. | ||
| All other comparisons will raise a TypeError. | ||
| | ||
| .. ipython:: python | ||
| | ||
| cat = pd.Series(pd.Categorical([1,2,3], levels=[3,2,1])) | ||
| cat_base = pd.Series(pd.Categorical([2,2,2], levels=[3,2,1])) | ||
| cat_base2 = pd.Series(pd.Categorical([2,2,2])) | ||
| | ||
| cat > cat_base | ||
| | ||
| # This doesn't work because the levels are not the same | ||
| try: | ||
| Contributor: There is a way to do this in the docs (showing an exception); can also do a code block. | ||
| Contributor Author: I haven't found a way to do that. Just letting the exception happen results in long stacktraces, and I don't like code blocks, where the exception message has to be manually inserted (and maintained). Maybe that would be a nice PR for the ipython directive... | ||
| Contributor: Yep, that's fine (I bet there is a way with ... | ||
| Contributor Author: I looked at the sphinx extension source and don't think there is a way without modifying it. `:okexcept:` basically only prevents sphinx from writing the exception to stdout. A ... | ||
| Contributor: Maybe can create a small function and put it in utils for this purpose (basically what you are doing). | ||
| Contributor Author: Something like ... | ||
| cat > cat_base2 | ||
| except TypeError as e: | ||
| print("TypeError: " + str(e)) | ||
| | ||
| cat > 2 | ||
| Contributor: Put a comment above (e.g. comparison vs. scalar). | ||
| Contributor Author: Done. | ||
| | ||
| .. note:: | ||
| | ||
| Comparisons with `Series`, `np.array` or a `Categorical` with different levels or ordering | ||
| will raise a `TypeError` because custom level ordering would result in two valid results: | ||
| one taking the ordering into account and one ignoring it. If you want to compare a `Categorical` | ||
| with such a type, you need to be explicit and convert the `Categorical` to values: | ||
| | ||
| .. ipython:: python | ||
| | ||
| base = np.array([1,2,3]) | ||
| | ||
| try: | ||
| cat > base | ||
| except TypeError as e: | ||
| print("TypeError: " + str(e)) | ||
| | ||
| np.asarray(cat) > base | ||
| | ||
| Getting the minimum and maximum, if the categorical is ordered: | ||
| | ||
| .. ipython:: python | ||
| | @@ -509,7 +549,8 @@ The same applies to ``df.append(df)``. | |
| Getting Data In/Out | ||
| ------------------- | ||
| | ||
| - Writing data (`Series`, `Frames`) to a HDF store that contains a ``category`` dtype will currently raise ``NotImplementedError``. | ||
| + Writing data (`Series`, `Frames`) to a HDF store that contains a ``category`` dtype will currently | ||
| + raise ``NotImplementedError``. | ||
| | ||
| Writing to a CSV file will convert the data, effectively removing any information about the | ||
| `Categorical` (levels and ordering). So if you read back the CSV file you have to convert the | ||
| | @@ -579,7 +620,7 @@ object and not as a low level `numpy` array dtype. This leads to some problems. | |
| try: | ||
| np.dtype("category") | ||
| except TypeError as e: | ||
| - print("TypeError: " + str(e)) | ||
| + print("TypeError: " + str(e)) | ||
| | ||
| dtype = pd.Categorical(["a"]).dtype | ||
| try: | ||
| | ||
show the cats after they are created
done
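
The "Getting Data In/Out" hunk states that a CSV round trip drops the categorical information (levels and ordering). A minimal sketch of that behaviour with a current pandas release follows. The column name `grade`, the sample values and the in-memory buffer are assumptions made for illustration, and `categories=` is used where the diff says `levels`.

```python
import io

import pandas as pd

# Hypothetical data: an ordered categorical column with an unused category "c".
wanted_categories = ["c", "b", "a"]
df = pd.DataFrame(
    {"grade": pd.Categorical(["b", "a", "b"], categories=wanted_categories, ordered=True)}
)

# Write to an in-memory CSV and read it back.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
restored = pd.read_csv(buf)

# After the round trip the column is no longer categorical.
was_categorical_after_read = isinstance(restored["grade"].dtype, pd.CategoricalDtype)

# Re-apply the categories and the ordering by hand.
restored["grade"] = pd.Categorical(
    restored["grade"], categories=wanted_categories, ordered=True
)
```

The categories and their order have to come from somewhere other than the CSV file, since the file only stores the values.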