Skip to content

Conversation

@rkc007
Copy link
Contributor

@rkc007 rkc007 commented Nov 23, 2020

@rkc007
Copy link
Contributor Author

rkc007 commented Nov 23, 2020

This is the first time I am contributing to pandas open source project. Please let me know if I can improve anything on this commit.

indexed by a :class:`TimedeltaIndex` with a fixed frequency when x-axis lower limit was greater than upper limit (:issue:`37454`)
- Bug in :meth:`DataFrameGroupBy.boxplot` when ``subplots=False``, a KeyError would raise (:issue:`16748`)
- Bug in :meth:`DataFrame.plot` and :meth:`Series.plot` was overwriting matplotlib's shared y axes behaviour when no sharey parameter was passed (:issue:`37942`)
- Bug in :meth:`DataFrame.plot` to handle nullable integers (:issue:`32073`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to be more specific:

Bug in :meth:`DataFrame.plot` was raising a ``TypeError`` with ``ExtensionDtype`` columns (:issue:`32073`) 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @arw2019 for your comments. I will add these changes to the next commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arw2019 I added the changes to the latest commit. Can you please review it again ?

Copy link
Contributor

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rkc007 for the PR!

Some comments. Also take a look at the CI (we standardize the use of pd.DataFrame/DataFrame within a file so having both errors)

Let us know if you have any questions!

numeric_data = numeric_data.copy()
for col in numeric_data:
numeric_data[col] = np.asarray(numeric_data[col])
def convert_to_ndarray(data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could use pd.to_numeric here I think. Illustrative example:

In [8]: s = pd.Series([1, 2, 3, pd.NA], dtype="Int64") In [9]: pd.to_numeric(s.to_numpy(), downcast="integer") Out[9]: array([ 1., 2., 3., nan])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @arw2019 for the example. I think I didn't get your point here. Can you please explain it a bit more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no don't use pd.to_numeric we already know the type and its an internal function

)

_check_plot_works(df.plot, x="A", y="B")
_check_plot_works(df[["A", "B"]].astype("Int64").plot, x="A", y="B")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd rather to specify the dtype in the dataframe construction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Thank you for the suggestion. I will rectify this in the next commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arw2019 I added the changes to the latest commit. Can you please review it again ?

Copy link
Contributor

@ivanovmg ivanovmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkc007, please see my comments below.

numeric_data = numeric_data.copy()
for col in numeric_data:
numeric_data[col] = np.asarray(numeric_data[col])
def convert_to_ndarray(data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving convert_to_ndarray out of the scope of the function _compute_plot_data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes pls do this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I added the changes to the latest commit. Can you please review it again ?

Comment on lines 156 to 157
"A": [1, 2, 3, 4, 5],
"B": [7, 5, np.nan, 3, 2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about having a nullable value in column "A" (x-axis)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanovmg I added the changes to the latest commit. Can you please review it again ?

@jreback jreback added NA - MaskedArrays Related to pd.NA and nullable extension arrays Visualization plotting labels Nov 26, 2020
numeric_data = numeric_data.copy()
for col in numeric_data:
numeric_data[col] = np.asarray(numeric_data[col])
def convert_to_ndarray(data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no don't use pd.to_numeric we already know the type and its an internal function

numeric_data = numeric_data.copy()
for col in numeric_data:
numeric_data[col] = np.asarray(numeric_data[col])
def convert_to_ndarray(data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes pls do this

numeric_data[col] = np.asarray(numeric_data[col])
def convert_to_ndarray(data):
# GH32073: cast to float if values contain nulled integers
if is_integer_dtype(data.dtype) and is_extension_array_dtype(data.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually we want to also catch / test nullable floats as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I fixed this in the latest commit.

return self.axes[0]

def _convert_to_ndarray(self, data):
# data = self.data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out @ivanovmg.
Deleted the comment in the latest commit.

Copy link
Contributor

@ivanovmg ivanovmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nitpick about the unnecessary comment.
Otherwise looks good to me.

@jreback jreback added this to the 1.2 milestone Dec 4, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. can you merge master and ping on green.

@rkc007
Copy link
Contributor Author

rkc007 commented Dec 7, 2020

@jreback the branch has the latest code and all the checks passed.

@jreback jreback merged commit 0aa598a into pandas-dev:master Dec 7, 2020
@jreback
Copy link
Contributor

jreback commented Dec 7, 2020

thanks @rkc007

@rkc007 rkc007 deleted the GH32073 branch December 8, 2020 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

NA - MaskedArrays Related to pd.NA and nullable extension arrays Visualization plotting

4 participants