TYP: Overload pandas.core.frame.dropna function signature
As the `Optional[DataFrame]` return type creates a union of incompatible types (DataFrame and None), we overload the signature on the value of the `inplace` parameter, to force the return of a DataFrame when inplace is false. Fixes GH38948
rbpatt2019 committed Jan 5, 2021
commit f2e2bcf8be84b3b97e47ff501af554826ae58428
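
As a rough illustration of the approach described in the commit message, the sketch below overloads a toy method on the `Literal` value of `inplace`. `Frame` and its `rows` attribute are hypothetical stand-ins, not pandas code, and the signature is much smaller than the real `dropna`.

```python
from __future__ import annotations

from typing import List, Literal, Optional, overload


class Frame:
    """Hypothetical stand-in for DataFrame; not pandas code."""

    def __init__(self, rows: List[Optional[int]]) -> None:
        self.rows = rows

    # inplace omitted or False: the checker infers a Frame return.
    @overload
    def dropna(self, inplace: Literal[False] = ...) -> Frame: ...

    # inplace=True: the checker infers a None return.
    @overload
    def dropna(self, inplace: Literal[True]) -> None: ...

    # Only this implementation runs; its annotation is the broad union.
    def dropna(self, inplace: bool = False) -> Optional[Frame]:
        kept = [r for r in self.rows if r is not None]
        if inplace:
            self.rows = kept
            return None
        return Frame(kept)


frame = Frame([1, None, 3])
# A type checker sees `Frame` here, so no None check is needed before `.rows`.
cleaned = frame.dropna()
print(cleaned.rows)  # prints [1, 3]
```

Without the overloads, every caller would have to narrow `Optional[Frame]` before using the result, even when `inplace` is left at its default.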
22 changes: 22 additions & 0 deletions pandas/core/frame.py
@@ -5069,6 +5069,28 @@ def notna(self) -> DataFrame:
    @doc(NDFrame.notna, klass=_shared_doc_kwargs["klass"])
    def notnull(self) -> DataFrame:
        return ~self.isna()

    # Overload function signature to prevent union of conflicting types
Contributor

I'd drop this comment

    # As Optional[DataFrame] is really Union[DataFrame, None]
    @overload
    def dropna(
        self,
        axis: Axis = ...,
        how: str = ...,
        thresh: Optional[int] = ...,
        subset: Optional[Union[Hashable, Sequence[Hashable]]] = ...,
        inplace: Literal[False] = False,
Contributor

should you be specifying the default value here? (I'm not sure)

Author

Looks like we were typing at the same time :P My understanding of overloading (and this might be very wrong) is that you only specify defaults in the overloads for the parameters you are overloading on, in this case `inplace`.

See below for an example taken from pandas.core.frame.reset_index.

pandas/pandas/core/frame.py

Lines 4812 to 4843 in 67d4cae

    @overload
    # https://github.com/python/mypy/issues/6580
    # Overloaded function signatures 1 and 2 overlap with incompatible return types
    def reset_index(  # type: ignore[misc]
        self,
        level: Optional[Union[Hashable, Sequence[Hashable]]] = ...,
        drop: bool = ...,
        inplace: Literal[False] = ...,
        col_level: Hashable = ...,
        col_fill: Label = ...,
    ) -> DataFrame:
        ...

    @overload
    def reset_index(
        self,
        level: Optional[Union[Hashable, Sequence[Hashable]]] = ...,
        drop: bool = ...,
        inplace: Literal[True] = ...,
        col_level: Hashable = ...,
        col_fill: Label = ...,
    ) -> None:
        ...

    def reset_index(
        self,
        level: Optional[Union[Hashable, Sequence[Hashable]]] = None,
        drop: bool = False,
        inplace: bool = False,
        col_level: Hashable = 0,
        col_fill: Label = "",
    ) -> Optional[DataFrame]:

Contributor

yeah in this example the defaults are not specified in the overloaded signatures so I think you should get rid of those

Contributor

That example also explains the mypy error (it's caused by a mypy bug). So to get mypy passing, you need to add a type ignore, plus a comment above the overloaded signature with the link to the mypy issue and the error message, as done in the reset_index example.

With those things done, I think this will be good to go in.

Author

Gotcha! Changes forthcoming...
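
To make the point of this thread concrete: signatures decorated with `@overload` exist only for the type checker, and at runtime only the final, undecorated definition is called, so its defaults are the ones that apply. A minimal sketch, using a hypothetical `scale` function that is not part of pandas:

```python
from typing import Literal, Union, overload


# `= ...` only tells the checker "this parameter has a default";
# the actual value comes from the implementation below.
@overload
def scale(value: int, factor: int = ..., as_text: Literal[False] = ...) -> int: ...


@overload
def scale(value: int, factor: int = ..., *, as_text: Literal[True]) -> str: ...


def scale(value: int, factor: int = 2, as_text: bool = False) -> Union[int, str]:
    result = value * factor
    return str(result) if as_text else result


print(scale(3))                # prints 6, using the implementation's factor=2
print(scale(3, as_text=True))  # prints the string 6
```

Writing a concrete default such as `= False` in an overload does not change runtime behaviour either, which is presumably why the `reset_index` example uses `...` throughout.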

    ) -> DataFrame: ...

    @overload
    def dropna(
        self,
        axis: Axis = ...,
        how: str = ...,
        thresh: Optional[int] = ...,
        subset: Optional[Union[Hashable, Sequence[Hashable]]] = ...,
Member

we now have an IndexLabel alias in pandas._typing that could be used for subset.

Suggested change
- subset: Optional[Union[Hashable, Sequence[Hashable]]] = ...,
+ subset: Optional[IndexLabel] = ...,
        inplace: Literal[True] = True,
    ) -> None: ...

    def dropna(
        self,
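
For context on the `IndexLabel` suggestion: to my understanding the alias in `pandas._typing` is equivalent to `Union[Hashable, Sequence[Hashable]]`, so the change only shortens the annotation. The sketch below redefines an equivalent alias locally instead of importing it, and `normalize_subset` is a hypothetical helper, not pandas code.

```python
from collections import abc
from typing import Hashable, List, Optional, Sequence, Union

# Local alias mirroring what pandas._typing.IndexLabel is assumed to be.
IndexLabel = Union[Hashable, Sequence[Hashable]]


def normalize_subset(subset: Optional[IndexLabel] = None) -> List[Hashable]:
    """Hypothetical helper: turn one label or a sequence of labels into a list."""
    if subset is None:
        return []
    # Strings and tuples are hashable sequences; this sketch treats each as one label.
    if isinstance(subset, (str, tuple)) or not isinstance(subset, abc.Sequence):
        return [subset]
    return list(subset)


print(normalize_subset("a"))         # prints ['a']
print(normalize_subset(["a", "b"]))  # prints ['a', 'b']
```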