pandas-dev
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
Lines changed: 1 addition & 21 deletions
diff --git a/.github/workflows/database.yml b/.github/workflows/database.yml
Lines changed: 0 additions & 123 deletions
diff --git a/.github/workflows/posix.yml b/.github/workflows/posix.yml
Lines changed: 30 additions & 0 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 4 additions & 14 deletions
diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/asv_bench/benchmarks/sparse.py b/asv_bench/benchmarks/sparse.py
Lines changed: 23 additions & 1 deletion
diff --git a/ci/code_checks.sh b/ci/code_checks.sh
Lines changed: 2 additions & 17 deletions
diff --git a/doc/source/whatsnew/v0.18.0.rst b/doc/source/whatsnew/v0.18.0.rst
Lines changed: 2 additions & 2 deletions
diff --git a/doc/source/whatsnew/v1.4.0.rst b/doc/source/whatsnew/v1.4.0.rst
Lines changed: 6 additions & 0 deletions
diff --git a/environment.yml b/environment.yml
Lines changed: 2 additions & 2 deletions
@@ -1,23 +1,3 @@
 # Contributing to pandas
 
-Whether you are a novice or experienced software developer, all contributions and suggestions are welcome!
-
-Our main contributing guide can be found [in this repo](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst) or [on the website](https://pandas.pydata.org/docs/dev/development/contributing.html). If you do not want to read it in its entirety, we will summarize the main ways in which you can contribute and point to relevant sections of that document for further information.
-
-## Getting Started
-
-If you are looking to contribute to the *pandas* codebase, the best place to start is the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues). This is also a great place for filing bug reports and making suggestions for ways in which we can improve the code and documentation.
-
-If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in the "[Where to start?](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#where-to-start)" section.
-
-## Filing Issues
-
-If you notice a bug in the code or documentation, or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the "[Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#bug-reports-and-enhancement-requests)" section.
-
-## Contributing to the Codebase
-
-The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](https://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#working-with-the-code)" section.
-
-Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#code-standards)" section.
-
-Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the "[Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#contributing-your-changes-to-pandas)" section. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!
+A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**.
@@ -25,6 +25,8 @@ jobs:
  strategy:
  matrix:
  settings: [
+ [actions-38-db-min.yaml, "((not slow and not network and not clipboard) or (single and db))", "", "", "", "", ""],
+ [actions-38-db.yaml, "((not slow and not network and not clipboard) or (single and db))", "", "", "", "", ""],
  [actions-38-minimum_versions.yaml, "not slow and not network and not clipboard", "", "", "", "", ""],
  [actions-38-locale_slow.yaml, "slow", "language-pack-it xsel", "it_IT.utf8", "it_IT.utf8", "", ""],
  [actions-38.yaml, "not slow and not clipboard", "", "", "", "", ""],
@@ -52,7 +54,35 @@ jobs:
  # https://github.community/t/concurrecy-not-work-for-push/183068/7
  group: ${{ github.event_name == 'push' && github.run_number || github.ref }}-${{ matrix.settings[0] }}
  cancel-in-progress: true
+
  services:
+ mysql:
+ image: mysql
+ env:
+ MYSQL_ALLOW_EMPTY_PASSWORD: yes
+ MYSQL_DATABASE: pandas
+ options: >-
+ --health-cmd "mysqladmin ping"
+ --health-interval 10s
+ --health-timeout 5s
+ --health-retries 5
+ ports:
+ - 3306:3306
+
+ postgres:
+ image: postgres
+ env:
+ POSTGRES_USER: postgres
+ POSTGRES_PASSWORD: postgres
+ POSTGRES_DB: pandas
+ options: >-
+ --health-cmd pg_isready
+ --health-interval 10s
+ --health-timeout 5s
+ --health-retries 5
+ ports:
+ - 5432:5432
+
  moto:
  image: motoserver/moto
  env:
 
@@ -35,25 +35,15 @@ repos:
  # we can lint all header files since they aren't "generated" like C files are.
  exclude: ^pandas/_libs/src/(klib|headers)/
  args: [--quiet, '--extensions=c,h', '--headers=h', --recursive, '--filter=-readability/casting,-runtime/int,-build/include_subdir']
-- repo: https://gitlab.com/pycqa/flake8
- rev: 3.9.2
+- repo: https://github.com/PyCQA/flake8
+ rev: 4.0.1
  hooks:
  - id: flake8
  additional_dependencies: &flake8_dependencies
- - flake8==3.9.2
- - flake8-comprehensions==3.1.0
+ - flake8==4.0.1
+ - flake8-comprehensions==3.7.0
  - flake8-bugbear==21.3.2
  - pandas-dev-flaker==0.2.0
- - id: flake8
- alias: flake8-cython
- name: flake8 (cython)
- types: [cython]
- args: [--append-config=flake8/cython.cfg]
- - id: flake8
- name: flake8 (cython template)
- files: \.pxi\.in$
- types: [text]
- args: [--append-config=flake8/cython-template.cfg]
 - repo: https://github.com/PyCQA/isort
  rev: 5.10.1
  hooks:
 
@@ -160,7 +160,7 @@ Most development discussions take place on GitHub in this repo. Further, the [pa
 
 All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.
 
-A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**. There is also an [overview](.github/CONTRIBUTING.md) on GitHub.
+A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**.
 
 If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.
 
 
@@ -198,7 +198,8 @@ def time_take(self, indices, allow_fill):
 class GetItem:
  def setup(self):
  N = 1_000_000
- arr = make_array(N, 1e-5, np.nan, np.float64)
+ d = 1e-5
+ arr = make_array(N, d, np.nan, np.float64)
  self.sp_arr = SparseArray(arr)
 
  def time_integer_indexing(self):
@@ -208,4 +209,25 @@ def time_slice(self):
  self.sp_arr[1:]
 
 
+class GetItemMask:
+
+ params = [True, False, np.nan]
+ param_names = ["fill_value"]
+
+ def setup(self, fill_value):
+ N = 1_000_000
+ d = 1e-5
+ arr = make_array(N, d, np.nan, np.float64)
+ self.sp_arr = SparseArray(arr)
+ b_arr = np.full(shape=N, fill_value=fill_value, dtype=np.bool8)
+ fv_inds = np.unique(
+ np.random.randint(low=0, high=N - 1, size=int(N * d), dtype=np.int32)
+ )
+ b_arr[fv_inds] = True if pd.isna(fill_value) else not fill_value
+ self.sp_b_arr = SparseArray(b_arr, dtype=np.bool8, fill_value=fill_value)
+
+ def time_mask(self, fill_value):
+ self.sp_arr[self.sp_b_arr]
+
+
 from .pandas_vb_common import setup # noqa: F401 isort:skip
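
The `GetItemMask` benchmark above times indexing a `SparseArray` with a sparse boolean mask. A minimal sketch of the same operation at a much smaller scale, assuming `pandas` and `numpy` are installed; the array size, seed, and variable names are illustrative and not part of the change:

```python
import numpy as np
from pandas.arrays import SparseArray

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
n = 1_000  # far smaller than the benchmark's N, for illustration

# Mostly-NaN float data, so the default fill value (NaN) covers most entries.
values = np.full(n, np.nan)
filled = rng.choice(n, size=10, replace=False)
values[filled] = rng.random(filled.size)
sp_arr = SparseArray(values)

# Boolean mask that is False almost everywhere, stored sparsely as well.
mask = np.zeros(n, dtype=bool)
mask[rng.choice(n, size=20, replace=False)] = True
sp_mask = SparseArray(mask, dtype=bool, fill_value=False)

# Indexing with the sparse mask keeps only the positions where it is True.
selected = sp_arr[sp_mask]

# The result should agree with plain NumPy boolean indexing on the dense data.
np.testing.assert_array_equal(np.asarray(selected), values[mask])
```

The sketch uses the builtin `bool` rather than the `np.bool8` alias seen in the benchmark, since that alias is not available in every NumPy release.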
@@ -65,23 +65,8 @@ fi
 if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
 
  MSG='Doctests' ; echo $MSG
- python -m pytest --doctest-modules \
- pandas/_config/ \
- pandas/_libs/ \
- pandas/_testing/ \
- pandas/api/ \
- pandas/arrays/ \
- pandas/compat/ \
- pandas/core \
- pandas/errors/ \
- pandas/io/ \
- pandas/plotting/ \
- pandas/tseries/ \
- pandas/util/ \
- pandas/_typing.py \
- pandas/_version.py \
- pandas/conftest.py \
- pandas/testing.py
+ # Ignore test_*.py files or else the unit tests will run
+ python -m pytest --doctest-modules --ignore-glob="**/test_*.py" pandas
  RET=$(($RET + $?)) ; echo $MSG "DONE"
 
  MSG='Cython Doctests' ; echo $MSG
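
The `--ignore-glob="**/test_*.py"` pattern in the hunk above is a shell-style wildcard. As a rough sketch of which files it would exclude, Python's `fnmatch` can stand in for pytest's matching; this is an approximation under that assumption (pytest may resolve the pattern against absolute paths), and the paths listed are illustrative rather than a full listing of the repository:

```python
from fnmatch import fnmatch

IGNORE_GLOB = "**/test_*.py"  # pattern passed to --ignore-glob in the hunk above

# Illustrative paths only.
paths = [
    "pandas/core/frame.py",
    "pandas/conftest.py",
    "pandas/testing.py",
    "pandas/tests/frame/test_api.py",
]

# Files matching the glob are skipped; everything else is still scanned for doctests.
ignored = [p for p in paths if fnmatch(p, IGNORE_GLOB)]
collected = [p for p in paths if p not in ignored]

print("ignored:", ignored)
print("collected:", collected)
```

Only files whose name starts with `test_` match, so modules such as `pandas/testing.py` and `pandas/conftest.py` remain in the doctest run without being listed explicitly.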
 
@@ -669,9 +669,9 @@ New signature
 
 .. ipython:: python
 
- pd.Series([0,1]).rank(axis=0, method='average', numeric_only=None,
+ pd.Series([0,1]).rank(axis=0, method='average', numeric_only=False,
  na_option='keep', ascending=True, pct=False)
- pd.DataFrame([0,1]).rank(axis=0, method='average', numeric_only=None,
+ pd.DataFrame([0,1]).rank(axis=0, method='average', numeric_only=False,
  na_option='keep', ascending=True, pct=False)
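
The signature change above replaces `numeric_only=None` with an explicit `False`. A small sketch of how the two explicit values differ on a frame with a non-numeric column, assuming `pandas` is installed; the column names and data are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with one numeric and one string column.
df = pd.DataFrame({"num": [3, 1, 2], "txt": ["b", "c", "a"]})

# numeric_only=False ranks every column, including the string column.
ranked_all = df.rank(numeric_only=False)

# numeric_only=True drops non-numeric columns before ranking.
ranked_num = df.rank(numeric_only=True)

print(list(ranked_all.columns))
print(list(ranked_num.columns))
print(ranked_num["num"].tolist())
```

Passing the flag explicitly avoids relying on the `None` behaviour that the v1.4.0 notes in this change mark as deprecated.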
 
 
 
@@ -548,6 +548,8 @@ Other Deprecations
 - Deprecated :meth:`Index.__getitem__` with a bool key; use ``index.values[key]`` to get the old behavior (:issue:`44051`)
 - Deprecated downcasting column-by-column in :meth:`DataFrame.where` with integer-dtypes (:issue:`44597`)
 - Deprecated :meth:`DatetimeIndex.union_many`, use :meth:`DatetimeIndex.union` instead (:issue:`44091`)
+- Deprecated ``numeric_only=None`` in :meth:`DataFrame.rank`; in a future version ``numeric_only`` must be either ``True`` or ``False`` (the default) (:issue:`45036`)
+- Deprecated :meth:`NaT.freq` (:issue:`45071`)
 -
 
 .. ---------------------------------------------------------------------------
@@ -603,6 +605,7 @@ Performance improvements
 - Performance improvement in :func:`to_csv` when :class:`MultiIndex` contains a lot of unused levels (:issue:`37484`)
 - Performance improvement in :func:`read_csv` when ``index_col`` was set with a numeric column (:issue:`44158`)
 - Performance improvement in :func:`concat` (:issue:`43354`)
+- Performance improvement in :meth:`SparseArray.__getitem__` (:issue:`23122`)
 - Performance improvement in constructing a :class:`DataFrame` from array-like objects like a ``Pytorch`` tensor (:issue:`44616`)
 -
 
@@ -734,6 +737,8 @@ Missing
 - Bug in :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` with ``inplace=True`` not writing to the underlying array(s) in-place (:issue:`44749`)
 - Bug in :meth:`Index.fillna` incorrectly returning an un-filled :class:`Index` when NA values are present and ``downcast`` argument is specified. This now raises ``NotImplementedError`` instead; do not pass ``downcast`` argument (:issue:`44873`)
 - Bug in :meth:`DataFrame.dropna` changing :class:`Index` even if no entries were dropped (:issue:`41965`)
+- Bug in :meth:`Series.fillna` with an object-dtype incorrectly ignoring ``downcast="infer"`` (:issue:`44241`)
+-
 
 MultiIndex
 ^^^^^^^^^^
@@ -850,6 +855,7 @@ Sparse
 - Bug in :meth:`DataFrame.sparse.to_coo` silently converting non-zero fill values to zero (:issue:`24817`)
 - Bug in :class:`SparseArray` comparison methods with an array-like operand of mismatched length raising ``AssertionError`` or unclear ``ValueError`` depending on the input (:issue:`43863`)
 - Bug in :class:`SparseArray` arithmetic methods ``floordiv`` and ``mod`` behaviors when dividing by zero not matching the non-sparse :class:`Series` behavior (:issue:`38172`)
+- Bug in :class:`SparseArray` unary methods as well as :meth:`SparseArray.isna` doesn't recalculate indexes (:issue:`44955`)
 -
 
 ExtensionArray
 
@@ -20,9 +20,9 @@ dependencies:
  # code checks
  - black=21.5b2
  - cpplint
- - flake8=3.9.2
+ - flake8=4.0.1
  - flake8-bugbear=21.3.2 # used by flake8, find likely bugs
- - flake8-comprehensions=3.1.0 # used by flake8, linting of unnecessary comprehensions
+ - flake8-comprehensions=3.7.0 # used by flake8, linting of unnecessary comprehensions
  - isort>=5.2.1 # check that imports are in the right order
  - mypy=0.920
  - pre-commit>=2.9.2