Commit 69edd6d

Merge branch 'master' into bug-union_many
2 parents 1566a95 + f1b255d commit 69edd6d

47 files changed, 422 additions and 291 deletions

.github/CONTRIBUTING.md

Lines changed: 1 addition & 21 deletions
@@ -1,23 +1,3 @@
 # Contributing to pandas

-Whether you are a novice or experienced software developer, all contributions and suggestions are welcome!
-
-Our main contributing guide can be found [in this repo](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst) or [on the website](https://pandas.pydata.org/docs/dev/development/contributing.html). If you do not want to read it in its entirety, we will summarize the main ways in which you can contribute and point to relevant sections of that document for further information.
-
-## Getting Started
-
-If you are looking to contribute to the *pandas* codebase, the best place to start is the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues). This is also a great place for filing bug reports and making suggestions for ways in which we can improve the code and documentation.
-
-If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in the "[Where to start?](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#where-to-start)" section.
-
-## Filing Issues
-
-If you notice a bug in the code or documentation, or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the "[Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#bug-reports-and-enhancement-requests)" section.
-
-## Contributing to the Codebase
-
-The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](https://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#working-with-the-code)" section.
-
-Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#code-standards)" section.
-
-Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the "[Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#contributing-your-changes-to-pandas)" section. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!
+A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**.

.github/workflows/database.yml

Lines changed: 0 additions & 123 deletions
This file was deleted.

.github/workflows/posix.yml

Lines changed: 30 additions & 0 deletions
@@ -25,6 +25,8 @@ jobs:
     strategy:
       matrix:
         settings: [
+          [actions-38-db-min.yaml, "((not slow and not network and not clipboard) or (single and db))", "", "", "", "", ""],
+          [actions-38-db.yaml, "((not slow and not network and not clipboard) or (single and db))", "", "", "", "", ""],
           [actions-38-minimum_versions.yaml, "not slow and not network and not clipboard", "", "", "", "", ""],
           [actions-38-locale_slow.yaml, "slow", "language-pack-it xsel", "it_IT.utf8", "it_IT.utf8", "", ""],
           [actions-38.yaml, "not slow and not clipboard", "", "", "", "", ""],
@@ -52,7 +54,35 @@ jobs:
       # https://github.community/t/concurrecy-not-work-for-push/183068/7
       group: ${{ github.event_name == 'push' && github.run_number || github.ref }}-${{ matrix.settings[0] }}
       cancel-in-progress: true
+
     services:
+      mysql:
+        image: mysql
+        env:
+          MYSQL_ALLOW_EMPTY_PASSWORD: yes
+          MYSQL_DATABASE: pandas
+        options: >-
+          --health-cmd "mysqladmin ping"
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+        ports:
+          - 3306:3306
+
+      postgres:
+        image: postgres
+        env:
+          POSTGRES_USER: postgres
+          POSTGRES_PASSWORD: postgres
+          POSTGRES_DB: pandas
+        options: >-
+          --health-cmd pg_isready
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+        ports:
+          - 5432:5432
+
       moto:
         image: motoserver/moto
         env:
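
The two new database entries in the matrix carry the string `((not slow and not network and not clipboard) or (single and db))`. Assuming, as its shape suggests, that this second field is a pytest `-m` marker expression, it can be read as a boolean predicate over the set of markers attached to each test. The sketch below is only an illustration of that reading; the helper name `is_selected` is hypothetical, and this is not how pytest evaluates the expression internally.

```python
def is_selected(markers: set) -> bool:
    """Evaluate the db-job expression for a test carrying `markers`.

    Mirrors: (not slow and not network and not clipboard) or (single and db)
    """
    # Left branch: the test has none of the excluded markers.
    regular = not ({"slow", "network", "clipboard"} & markers)
    # Right branch: the test carries both `single` and `db`.
    database = {"single", "db"} <= markers
    return regular or database


if __name__ == "__main__":
    # Hypothetical marker combinations, chosen only to exercise both branches.
    for marks in (set(), {"slow"}, {"slow", "single", "db"}, {"network", "db"}):
        print(sorted(marks), "->", is_selected(marks))
```

Under this reading, a test marked `slow` is still picked up by the database jobs when it also carries both `single` and `db`, whereas `db` alone does not override an excluded marker.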

.pre-commit-config.yaml

Lines changed: 4 additions & 14 deletions
@@ -35,25 +35,15 @@ repos:
     # we can lint all header files since they aren't "generated" like C files are.
     exclude: ^pandas/_libs/src/(klib|headers)/
     args: [--quiet, '--extensions=c,h', '--headers=h', --recursive, '--filter=-readability/casting,-runtime/int,-build/include_subdir']
-- repo: https://gitlab.com/pycqa/flake8
-  rev: 3.9.2
+- repo: https://github.com/PyCQA/flake8
+  rev: 4.0.1
   hooks:
   - id: flake8
     additional_dependencies: &flake8_dependencies
-    - flake8==3.9.2
-    - flake8-comprehensions==3.1.0
+    - flake8==4.0.1
+    - flake8-comprehensions==3.7.0
     - flake8-bugbear==21.3.2
     - pandas-dev-flaker==0.2.0
-  - id: flake8
-    alias: flake8-cython
-    name: flake8 (cython)
-    types: [cython]
-    args: [--append-config=flake8/cython.cfg]
-  - id: flake8
-    name: flake8 (cython template)
-    files: \.pxi\.in$
-    types: [text]
-    args: [--append-config=flake8/cython-template.cfg]
 - repo: https://github.com/PyCQA/isort
   rev: 5.10.1
   hooks:

README.md

Lines changed: 1 addition & 1 deletion
@@ -160,7 +160,7 @@ Most development discussions take place on GitHub in this repo. Further, the [pa

 All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

-A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**. There is also an [overview](.github/CONTRIBUTING.md) on GitHub.
+A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**.

 If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.

asv_bench/benchmarks/sparse.py

Lines changed: 23 additions & 1 deletion
@@ -198,7 +198,8 @@ def time_take(self, indices, allow_fill):
 class GetItem:
     def setup(self):
         N = 1_000_000
-        arr = make_array(N, 1e-5, np.nan, np.float64)
+        d = 1e-5
+        arr = make_array(N, d, np.nan, np.float64)
         self.sp_arr = SparseArray(arr)

     def time_integer_indexing(self):
@@ -208,4 +209,25 @@ def time_slice(self):
         self.sp_arr[1:]


+class GetItemMask:
+
+    params = [True, False, np.nan]
+    param_names = ["fill_value"]
+
+    def setup(self, fill_value):
+        N = 1_000_000
+        d = 1e-5
+        arr = make_array(N, d, np.nan, np.float64)
+        self.sp_arr = SparseArray(arr)
+        b_arr = np.full(shape=N, fill_value=fill_value, dtype=np.bool8)
+        fv_inds = np.unique(
+            np.random.randint(low=0, high=N - 1, size=int(N * d), dtype=np.int32)
+        )
+        b_arr[fv_inds] = True if pd.isna(fill_value) else not fill_value
+        self.sp_b_arr = SparseArray(b_arr, dtype=np.bool8, fill_value=fill_value)
+
+    def time_mask(self, fill_value):
+        self.sp_arr[self.sp_b_arr]
+
+
 from .pandas_vb_common import setup # noqa: F401 isort:skip

ci/code_checks.sh

Lines changed: 2 additions & 17 deletions
@@ -65,23 +65,8 @@ fi
 if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then

     MSG='Doctests' ; echo $MSG
-    python -m pytest --doctest-modules \
-      pandas/_config/ \
-      pandas/_libs/ \
-      pandas/_testing/ \
-      pandas/api/ \
-      pandas/arrays/ \
-      pandas/compat/ \
-      pandas/core \
-      pandas/errors/ \
-      pandas/io/ \
-      pandas/plotting/ \
-      pandas/tseries/ \
-      pandas/util/ \
-      pandas/_typing.py \
-      pandas/_version.py \
-      pandas/conftest.py \
-      pandas/testing.py
+    # Ignore test_*.py files or else the unit tests will run
+    python -m pytest --doctest-modules --ignore-glob="**/test_*.py" pandas
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Cython Doctests' ; echo $MSG
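
The new doctest invocation replaces the explicit directory list with a single `--ignore-glob="**/test_*.py"` over the whole `pandas` package. As a rough illustration of which paths that pattern catches, the sketch below applies Python's `fnmatch` to a handful of example paths. The helper `is_ignored` and the path list are illustrative assumptions, and pytest's own path handling (for example, how it resolves the pattern against absolute paths) may differ in detail.

```python
from fnmatch import fnmatch

# Pattern taken from the new command line in ci/code_checks.sh.
PATTERN = "**/test_*.py"


def is_ignored(path: str, pattern: str = PATTERN) -> bool:
    """Return True if `path` matches the ignore pattern (fnmatch semantics)."""
    return fnmatch(path, pattern)


# Example paths, used here only for illustration.
paths = [
    "pandas/core/frame.py",
    "pandas/tests/frame/test_api.py",
    "pandas/_testing/asserters.py",
    "pandas/conftest.py",
    "pandas/testing.py",
]

# Paths that would still be scanned for doctests under this approximation.
kept = [p for p in paths if not is_ignored(p)]

if __name__ == "__main__":
    for p in paths:
        print(f"{p}: {'ignored' if is_ignored(p) else 'kept'}")
```

With `fnmatch` semantics, `*` also matches `/`, so a module inside a directory whose name starts with `test_` would be ignored as well, even if the file itself is not named `test_*.py`.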

doc/source/whatsnew/v0.18.0.rst

Lines changed: 2 additions & 2 deletions
@@ -669,9 +669,9 @@ New signature

 .. ipython:: python

-   pd.Series([0,1]).rank(axis=0, method='average', numeric_only=None,
+   pd.Series([0,1]).rank(axis=0, method='average', numeric_only=False,
                          na_option='keep', ascending=True, pct=False)
-   pd.DataFrame([0,1]).rank(axis=0, method='average', numeric_only=None,
+   pd.DataFrame([0,1]).rank(axis=0, method='average', numeric_only=False,
                            na_option='keep', ascending=True, pct=False)


doc/source/whatsnew/v1.4.0.rst

Lines changed: 6 additions & 0 deletions
@@ -548,6 +548,8 @@ Other Deprecations
 - Deprecated :meth:`Index.__getitem__` with a bool key; use ``index.values[key]`` to get the old behavior (:issue:`44051`)
 - Deprecated downcasting column-by-column in :meth:`DataFrame.where` with integer-dtypes (:issue:`44597`)
 - Deprecated :meth:`DatetimeIndex.union_many`, use :meth:`DatetimeIndex.union` instead (:issue:`44091`)
+- Deprecated ``numeric_only=None`` in :meth:`DataFrame.rank`; in a future version ``numeric_only`` must be either ``True`` or ``False`` (the default) (:issue:`45036`)
+- Deprecated :meth:`NaT.freq` (:issue:`45071`)
 -

 .. ---------------------------------------------------------------------------
@@ -603,6 +605,7 @@ Performance improvements
 - Performance improvement in :func:`to_csv` when :class:`MultiIndex` contains a lot of unused levels (:issue:`37484`)
 - Performance improvement in :func:`read_csv` when ``index_col`` was set with a numeric column (:issue:`44158`)
 - Performance improvement in :func:`concat` (:issue:`43354`)
+- Performance improvement in :meth:`SparseArray.__getitem__` (:issue:`23122`)
 - Performance improvement in constructing a :class:`DataFrame` from array-like objects like a ``Pytorch`` tensor (:issue:`44616`)
 -

@@ -734,6 +737,8 @@ Missing
 - Bug in :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` with ``inplace=True`` not writing to the underlying array(s) in-place (:issue:`44749`)
 - Bug in :meth:`Index.fillna` incorrectly returning an un-filled :class:`Index` when NA values are present and ``downcast`` argument is specified. This now raises ``NotImplementedError`` instead; do not pass ``downcast`` argument (:issue:`44873`)
 - Bug in :meth:`DataFrame.dropna` changing :class:`Index` even if no entries were dropped (:issue:`41965`)
+- Bug in :meth:`Series.fillna` with an object-dtype incorrectly ignoring ``downcast="infer"`` (:issue:`44241`)
+-

 MultiIndex
 ^^^^^^^^^^
@@ -850,6 +855,7 @@ Sparse
 - Bug in :meth:`DataFrame.sparse.to_coo` silently converting non-zero fill values to zero (:issue:`24817`)
 - Bug in :class:`SparseArray` comparison methods with an array-like operand of mismatched length raising ``AssertionError`` or unclear ``ValueError`` depending on the input (:issue:`43863`)
 - Bug in :class:`SparseArray` arithmetic methods ``floordiv`` and ``mod`` behaviors when dividing by zero not matching the non-sparse :class:`Series` behavior (:issue:`38172`)
+- Bug in :class:`SparseArray` unary methods as well as :meth:`SparseArray.isna` doesn't recalculate indexes (:issue:`44955`)
 -

 ExtensionArray

environment.yml

Lines changed: 2 additions & 2 deletions
@@ -20,9 +20,9 @@ dependencies:
   # code checks
   - black=21.5b2
   - cpplint
-  - flake8=3.9.2
+  - flake8=4.0.1
   - flake8-bugbear=21.3.2 # used by flake8, find likely bugs
-  - flake8-comprehensions=3.1.0 # used by flake8, linting of unnecessary comprehensions
+  - flake8-comprehensions=3.7.0 # used by flake8, linting of unnecessary comprehensions
   - isort>=5.2.1 # check that imports are in the right order
   - mypy=0.920
   - pre-commit>=2.9.2
