Commit 55d6ed1

Merge branch 'master' into bug-union_many
2 parents 69edd6d + c3d3357 commit 55d6ed1

97 files changed

+1401
-812
lines changed

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -55,7 +55,7 @@ jobs:
5555

5656
- name: Install pyright
5757
# note: keep version in sync with .pre-commit-config.yaml
58-
run: npm install -g pyright@1.1.171
58+
run: npm install -g pyright@1.1.200
5959

6060
- name: Build Pandas
6161
uses: ./.github/actions/build_pandas

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ repos:
7878
types: [python]
7979
stages: [manual]
8080
# note: keep version in sync with .github/workflows/ci.yml
81-
additional_dependencies: ['pyright@1.1.171']
81+
additional_dependencies: ['pyright@1.1.200']
8282
- repo: local
8383
hooks:
8484
- id: flake8-rst

doc/source/getting_started/comparison/comparison_with_spreadsheets.rst

Lines changed: 4 additions & 3 deletions
@@ -435,13 +435,14 @@ The equivalent in pandas:
435435
Adding a row
436436
~~~~~~~~~~~~
437437

438-
Assuming we are using a :class:`~pandas.RangeIndex` (numbered ``0``, ``1``, etc.), we can use :meth:`DataFrame.append` to add a row to the bottom of a ``DataFrame``.
438+
Assuming we are using a :class:`~pandas.RangeIndex` (numbered ``0``, ``1``, etc.), we can use :func:`concat` to add a row to the bottom of a ``DataFrame``.
439439

440440
.. ipython:: python
441441
442442
df
443-
new_row = {"class": "E", "student_count": 51, "all_pass": True}
444-
df.append(new_row, ignore_index=True)
443+
new_row = pd.DataFrame([["E", 51, True]],
444+
columns=["class", "student_count", "all_pass"])
445+
pd.concat([df, new_row])
445446
446447
447448
Find and Replace
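
Review note: the removed ``append`` call passed ``ignore_index=True``, while the replacement ``pd.concat([df, new_row])`` keeps each frame's own labels. The sketch below uses a small stand-in ``df`` (the real one in the docs is not shown in this hunk, so its contents here are an assumption) to show the difference in the resulting index.

```python
import pandas as pd

# Stand-in for the spreadsheet-comparison frame; the contents are illustrative only.
df = pd.DataFrame(
    {
        "class": ["A", "B"],
        "student_count": [42, 35],
        "all_pass": [True, False],
    }
)

# One-row frame built the same way as in the diff.
new_row = pd.DataFrame(
    [["E", 51, True]],
    columns=["class", "student_count", "all_pass"],
)

# Without ignore_index, the new row keeps its own label 0.
kept = pd.concat([df, new_row])

# With ignore_index=True, the result is renumbered 0..n-1.
renumbered = pd.concat([df, new_row], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0]
print(renumbered.index.tolist())  # [0, 1, 2]
```

If the surrounding text keeps the "Assuming we are using a ``RangeIndex``" wording, passing ``ignore_index=True`` may be the closer match to the old behavior.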

doc/source/user_guide/10min.rst

Lines changed: 0 additions & 1 deletion
@@ -478,7 +478,6 @@ Concatenating pandas objects together with :func:`concat`:
478478
a row requires a copy, and may be expensive. We recommend passing a
479479
pre-built list of records to the :class:`DataFrame` constructor instead
480480
of building a :class:`DataFrame` by iteratively appending records to it.
481-
See :ref:`Appending to dataframe <merging.concatenation>` for more.
482481

483482
Join
484483
~~~~
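
Review note: the retained paragraph recommends passing a pre-built list of records to the ``DataFrame`` constructor rather than appending row by row. A minimal sketch of that pattern follows; the record fields ``id`` and ``square`` are made up for illustration.

```python
import pandas as pd

# Collect plain dict records first; appending to a Python list is cheap.
records = []
for i in range(3):
    records.append({"id": i, "square": i * i})

# Build the DataFrame once, instead of growing it one row at a time.
df = pd.DataFrame(records)

print(df.shape)               # (3, 2)
print(df["square"].tolist())  # [0, 1, 4]
```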

doc/source/user_guide/cookbook.rst

Lines changed: 3 additions & 3 deletions
@@ -929,9 +929,9 @@ Valid frequency arguments to Grouper :ref:`Timeseries <timeseries.offset_aliases
929929
Merge
930930
-----
931931

932-
The :ref:`Concat <merging.concatenation>` docs. The :ref:`Join <merging.join>` docs.
932+
The :ref:`Join <merging.join>` docs.
933933

934-
`Append two dataframes with overlapping index (emulate R rbind)
934+
`Concatenate two dataframes with overlapping index (emulate R rbind)
935935
<https://stackoverflow.com/questions/14988480/pandas-version-of-rbind>`__
936936

937937
.. ipython:: python
@@ -944,7 +944,7 @@ Depending on df construction, ``ignore_index`` may be needed
944944

945945
.. ipython:: python
946946
947-
df = df1.append(df2, ignore_index=True)
947+
df = pd.concat([df1, df2], ignore_index=True)
948948
df
949949
950950
`Self Join of a DataFrame

doc/source/user_guide/merging.rst

Lines changed: 3 additions & 84 deletions
@@ -237,59 +237,6 @@ Similarly, we could index before the concatenation:
237237
p.plot([df1, df4], result, labels=["df1", "df4"], vertical=False);
238238
plt.close("all");
239239
240-
.. _merging.concatenation:
241-
242-
Concatenating using ``append``
243-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
244-
245-
A useful shortcut to :func:`~pandas.concat` are the :meth:`~DataFrame.append`
246-
instance methods on ``Series`` and ``DataFrame``. These methods actually predated
247-
``concat``. They concatenate along ``axis=0``, namely the index:
248-
249-
.. ipython:: python
250-
251-
result = df1.append(df2)
252-
253-
.. ipython:: python
254-
:suppress:
255-
256-
@savefig merging_append1.png
257-
p.plot([df1, df2], result, labels=["df1", "df2"], vertical=True);
258-
plt.close("all");
259-
260-
In the case of ``DataFrame``, the indexes must be disjoint but the columns do not
261-
need to be:
262-
263-
.. ipython:: python
264-
265-
result = df1.append(df4, sort=False)
266-
267-
.. ipython:: python
268-
:suppress:
269-
270-
@savefig merging_append2.png
271-
p.plot([df1, df4], result, labels=["df1", "df4"], vertical=True);
272-
plt.close("all");
273-
274-
``append`` may take multiple objects to concatenate:
275-
276-
.. ipython:: python
277-
278-
result = df1.append([df2, df3])
279-
280-
.. ipython:: python
281-
:suppress:
282-
283-
@savefig merging_append3.png
284-
p.plot([df1, df2, df3], result, labels=["df1", "df2", "df3"], vertical=True);
285-
plt.close("all");
286-
287-
.. note::
288-
289-
Unlike the :py:meth:`~list.append` method, which appends to the original list
290-
and returns ``None``, :meth:`~DataFrame.append` here **does not** modify
291-
``df1`` and returns its copy with ``df2`` appended.
292-
293240
.. _merging.ignore_index:
294241

295242
Ignoring indexes on the concatenation axis
@@ -309,19 +256,6 @@ do this, use the ``ignore_index`` argument:
309256
p.plot([df1, df4], result, labels=["df1", "df4"], vertical=True);
310257
plt.close("all");
311258
312-
This is also a valid argument to :meth:`DataFrame.append`:
313-
314-
.. ipython:: python
315-
316-
result = df1.append(df4, ignore_index=True, sort=False)
317-
318-
.. ipython:: python
319-
:suppress:
320-
321-
@savefig merging_append_ignore_index.png
322-
p.plot([df1, df4], result, labels=["df1", "df4"], vertical=True);
323-
plt.close("all");
324-
325259
.. _merging.mixed_ndims:
326260

327261
Concatenating with mixed ndims
@@ -473,14 +407,13 @@ like GroupBy where the order of a categorical variable is meaningful.
473407
Appending rows to a DataFrame
474408
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
475409

476-
While not especially efficient (since a new object must be created), you can
477-
append a single row to a ``DataFrame`` by passing a ``Series`` or dict to
478-
``append``, which returns a new ``DataFrame`` as above.
410+
If you have a ``Series`` that you want to append as a single row to a ``DataFrame``, you can convert the row into a
411+
``DataFrame`` and use ``concat``:
479412

480413
.. ipython:: python
481414
482415
s2 = pd.Series(["X0", "X1", "X2", "X3"], index=["A", "B", "C", "D"])
483-
result = df1.append(s2, ignore_index=True)
416+
result = pd.concat([df1, s2.to_frame().T], ignore_index=True)
484417
485418
.. ipython:: python
486419
:suppress:
@@ -493,20 +426,6 @@ You should use ``ignore_index`` with this method to instruct DataFrame to
493426
discard its index. If you wish to preserve the index, you should construct an
494427
appropriately-indexed DataFrame and append or concatenate those objects.
495428

496-
You can also pass a list of dicts or Series:
497-
498-
.. ipython:: python
499-
500-
dicts = [{"A": 1, "B": 2, "C": 3, "X": 4}, {"A": 5, "B": 6, "C": 7, "Y": 8}]
501-
result = df1.append(dicts, ignore_index=True, sort=False)
502-
503-
.. ipython:: python
504-
:suppress:
505-
506-
@savefig merging_append_dits.png
507-
p.plot([df1, pd.DataFrame(dicts)], result, labels=["df1", "dicts"], vertical=True);
508-
plt.close("all");
509-
510429
.. _merging.join:
511430

512431
Database-style DataFrame or named Series joining/merging
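
Review note: this hunk removes the "list of dicts" example without a ``concat`` counterpart. The sketch below shows one possible equivalent, alongside the ``Series``-as-row case from the diff. The two-row ``df1`` is a shortened stand-in for the frame used in the merging guide, not the original.

```python
import pandas as pd

# Shortened stand-in for df1 from the merging guide.
df1 = pd.DataFrame(
    {
        "A": ["A0", "A1"],
        "B": ["B0", "B1"],
        "C": ["C0", "C1"],
        "D": ["D0", "D1"],
    }
)

# A Series becomes a one-row DataFrame via to_frame().T,
# so its index labels line up with df1's columns.
s2 = pd.Series(["X0", "X1", "X2", "X3"], index=["A", "B", "C", "D"])
with_series = pd.concat([df1, s2.to_frame().T], ignore_index=True)

# A list of dicts can be wrapped in a DataFrame before concatenating.
# Columns missing from either side are filled with NaN.
dicts = [{"A": 1, "B": 2, "C": 3, "X": 4}, {"A": 5, "B": 6, "C": 7, "Y": 8}]
with_dicts = pd.concat([df1, pd.DataFrame(dicts)], ignore_index=True, sort=False)

print(with_series.shape)  # (3, 4)
print(with_dicts.shape)   # (4, 6)
```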

doc/source/user_guide/timeseries.rst

Lines changed: 3 additions & 8 deletions
@@ -2102,19 +2102,14 @@ The ``period`` dtype can be used in ``.astype(...)``. It allows one to change th
21022102
# change monthly freq to daily freq
21032103
pi.astype("period[D]")
21042104
2105+
# convert to DatetimeIndex
2106+
pi.astype("datetime64[ns]")
2107+
21052108
# convert to PeriodIndex
21062109
dti = pd.date_range("2011-01-01", freq="M", periods=3)
21072110
dti
21082111
dti.astype("period[M]")
21092112
2110-
.. deprecated:: 1.4.0
2111-
Converting PeriodIndex to DatetimeIndex with ``.astype(...)`` is deprecated and will raise in a future version. Use ``obj.to_timestamp(how).tz_localize(dtype.tz)`` instead.
2112-
2113-
.. ipython:: python
2114-
2115-
# convert to DatetimeIndex
2116-
pi.to_timestamp(how="start")
2117-
21182113
PeriodIndex partial string indexing
21192114
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21202115

doc/source/whatsnew/v0.6.1.rst

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ Version 0.6.1 (December 13, 2011)
66

77
New features
88
~~~~~~~~~~~~
9-
- Can :ref:`append single rows <merging.append.row>` (as Series) to a DataFrame
9+
- Can append single rows (as Series) to a DataFrame
1010
- Add Spearman and Kendall rank :ref:`correlation <computation.correlation>`
1111
options to Series.corr and DataFrame.corr (:issue:`428`)
1212
- :ref:`Added <indexing.basics.get_value>` ``get_value`` and ``set_value`` methods to

doc/source/whatsnew/v0.7.0.rst

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ New features
1919
intersection of the other axes. Improves performance of ``Series.append`` and
2020
``DataFrame.append`` (:issue:`468`, :issue:`479`, :issue:`273`)
2121

22-
- :ref:`Can <merging.concatenation>` pass multiple DataFrames to
22+
- Can pass multiple DataFrames to
2323
``DataFrame.append`` to concatenate (stack) and multiple Series to
2424
``Series.append`` too
2525

doc/source/whatsnew/v1.4.0.rst

Lines changed: 73 additions & 6 deletions
@@ -111,6 +111,7 @@ Additionally there are specific enhancements to the HTML specific rendering:
111111
- :meth:`.Styler.to_html` introduces keyword arguments ``sparse_index``, ``sparse_columns``, ``bold_headers``, ``caption``, ``max_rows`` and ``max_columns`` (:issue:`41946`, :issue:`43149`, :issue:`42972`).
112112
- :meth:`.Styler.to_html` omits CSSStyle rules for hidden table elements as a performance enhancement (:issue:`43619`)
113113
- Custom CSS classes can now be directly specified without string replacement (:issue:`43686`)
114+
- Ability to render hyperlinks automatically via a new ``hyperlinks`` formatting keyword argument (:issue:`45058`)
114115

115116
There are also some LaTeX specific enhancements:
116117

@@ -275,7 +276,7 @@ ignored when finding the concatenated dtype. These are now consistently *not* i
275276
276277
df1 = pd.DataFrame({"bar": [pd.Timestamp("2013-01-01")]}, index=range(1))
277278
df2 = pd.DataFrame({"bar": np.nan}, index=range(1, 2))
278-
res = df1.append(df2)
279+
res = pd.concat([df1, df2])
279280
280281
Previously, the float-dtype in ``df2`` would be ignored so the result dtype would be ``datetime64[ns]``. As a result, the ``np.nan`` would be cast to ``NaT``.
281282

@@ -364,10 +365,29 @@ second column is instead renamed to ``a.2``.
364365
365366
res
366367
367-
.. _whatsnew_140.notable_bug_fixes.notable_bug_fix3:
368+
.. _whatsnew_140.notable_bug_fixes.unstack_pivot_int32_limit:
368369

369-
notable_bug_fix3
370-
^^^^^^^^^^^^^^^^
370+
unstack and pivot_table no longer raise ValueError for result that would exceed int32 limit
371+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
372+
373+
Previously :meth:`DataFrame.pivot_table` and :meth:`DataFrame.unstack` would raise a ``ValueError`` if the operation
374+
could produce a result with more than ``2**31 - 1`` elements. This operation now raises a :class:`errors.PerformanceWarning`
375+
instead (:issue:`26314`).
376+
377+
*Previous behavior*:
378+
379+
.. code-block:: ipython
380+
381+
In [3]: df = pd.DataFrame({"ind1": np.arange(2 ** 16), "ind2": np.arange(2 ** 16), "count": 0})
382+
In [4]: df.pivot_table(index="ind1", columns="ind2", values="count", aggfunc="count")
383+
ValueError: Unstacked DataFrame is too big, causing int32 overflow
384+
385+
*New behavior*:
386+
387+
.. code-block:: ipython
388+
389+
In [4]: df.pivot_table(index="ind1", columns="ind2", values="count", aggfunc="count")
390+
PerformanceWarning: The following operation may generate 4294967296 cells in the resulting pandas object.
371391
372392
.. ---------------------------------------------------------------------------
373393
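
Review note: the cell count quoted in the new ``PerformanceWarning`` can be checked with plain arithmetic. The example frame has ``2 ** 16`` distinct values in both ``ind1`` and ``ind2``, so the pivoted result would have one cell per pair. The variable names below are illustrative only.

```python
# Distinct values on each axis of the pivoted result in the example.
n_rows = 2 ** 16
n_cols = 2 ** 16

# Total cells the pivot would need to allocate.
cells = n_rows * n_cols

# Largest value representable as a signed 32-bit integer.
int32_max = 2 ** 31 - 1

print(cells)              # 4294967296
print(cells > int32_max)  # True
```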
@@ -398,7 +418,7 @@ If installed, we now require:
398418
+-----------------+-----------------+----------+---------+
399419
| pytest (dev) | 6.0 | | |
400420
+-----------------+-----------------+----------+---------+
401-
| mypy (dev) | 0.920 | | X |
421+
| mypy (dev) | 0.930 | | X |
402422
+-----------------+-----------------+----------+---------+
403423

404424
For `optional libraries <https://pandas.pydata.org/docs/getting_started/install.html>`_ the general recommendation is to use the latest version.
@@ -510,6 +530,49 @@ when given numeric data, but in the future, a :class:`NumericIndex` will be retu
510530
Out [4]: NumericIndex([1, 2, 3], dtype='uint64')
511531
512532
533+
.. _whatsnew_140.deprecations.frame_series_append:
534+
535+
Deprecated Frame.append and Series.append
536+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
537+
538+
:meth:`DataFrame.append` and :meth:`Series.append` have been deprecated and will be removed in pandas 2.0.
539+
Use :func:`pandas.concat` instead (:issue:`35407`).
540+
541+
*Deprecated syntax*
542+
543+
.. code-block:: ipython
544+
545+
In [1]: pd.Series([1, 2]).append(pd.Series([3, 4]))
546+
Out [1]:
547+
<stdin>:1: FutureWarning: The series.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
548+
0 1
549+
1 2
550+
0 3
551+
1 4
552+
dtype: int64
553+
554+
In [2]: df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
555+
In [3]: df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
556+
In [4]: df1.append(df2)
557+
Out [4]:
558+
<stdin>:1: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
559+
A B
560+
0 1 2
561+
1 3 4
562+
0 5 6
563+
1 7 8
564+
565+
*Recommended syntax*
566+
567+
.. ipython:: python
568+
569+
pd.concat([pd.Series([1, 2]), pd.Series([3, 4])])
570+
571+
df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
572+
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
573+
pd.concat([df1, df2])
574+
575+
513576
.. _whatsnew_140.deprecations.other:
514577

515578
Other Deprecations
@@ -537,7 +600,6 @@ Other Deprecations
537600
- Deprecated casting behavior when passing an item with mismatched-timezone to :meth:`DatetimeIndex.insert`, :meth:`DatetimeIndex.putmask`, :meth:`DatetimeIndex.where`, :meth:`DatetimeIndex.fillna`, :meth:`Series.mask`, :meth:`Series.where`, :meth:`Series.fillna`, :meth:`Series.shift`, :meth:`Series.replace`, :meth:`Series.reindex` (and :class:`DataFrame` column analogues). In the past this has cast to object dtype. In a future version, these will cast the passed item to the index or series's timezone (:issue:`37605`, :issue:`44940`)
538601
- Deprecated the 'errors' keyword argument in :meth:`Series.where`, :meth:`DataFrame.where`, :meth:`Series.mask`, and :meth:`DataFrame.mask`; in a future version the argument will be removed (:issue:`44294`)
539602
- Deprecated the ``prefix`` keyword argument in :func:`read_csv` and :func:`read_table`, in a future version the argument will be removed (:issue:`43396`)
540-
- Deprecated :meth:`PeriodIndex.astype` to ``datetime64[ns]`` or ``DatetimeTZDtype``, use ``obj.to_timestamp(how).tz_localize(dtype.tz)`` instead (:issue:`44398`)
541603
- Deprecated passing non boolean argument to sort in :func:`concat` (:issue:`41518`)
542604
- Deprecated passing arguments as positional for :func:`read_fwf` other than ``filepath_or_buffer`` (:issue:`41485`)
543605
- Deprecated passing ``skipna=None`` for :meth:`DataFrame.mad` and :meth:`Series.mad`, pass ``skipna=True`` instead (:issue:`44580`)
@@ -549,6 +611,7 @@ Other Deprecations
549611
- Deprecated downcasting column-by-column in :meth:`DataFrame.where` with integer-dtypes (:issue:`44597`)
550612
- Deprecated :meth:`DatetimeIndex.union_many`, use :meth:`DatetimeIndex.union` instead (:issue:`44091`)
551613
- Deprecated ``numeric_only=None`` in :meth:`DataFrame.rank`; in a future version ``numeric_only`` must be either ``True`` or ``False`` (the default) (:issue:`45036`)
614+
- Deprecated the behavior of :meth:`Timestamp.utcfromtimestamp`, in the future it will return a timezone-aware UTC :class:`Timestamp` (:issue:`22451`)
552615
- Deprecated :meth:`NaT.freq` (:issue:`45071`)
553616
-
554617

@@ -641,6 +704,7 @@ Datetimelike
641704
- Bug in :meth:`Index.insert` for inserting ``np.datetime64``, ``np.timedelta64`` or ``tuple`` into :class:`Index` with ``dtype='object'`` with negative loc adding ``None`` and replacing existing value (:issue:`44509`)
642705
- Bug in :meth:`Series.mode` with ``DatetimeTZDtype`` incorrectly returning timezone-naive and ``PeriodDtype`` incorrectly raising (:issue:`41927`)
643706
- Bug in :class:`DateOffset` addition with :class:`Timestamp` where ``offset.nanoseconds`` would not be included in the result (:issue:`43968`, :issue:`36589`)
707+
- Bug in :meth:`Timestamp.fromtimestamp` not supporting the ``tz`` argument (:issue:`45083`)
644708
- Bug in :class:`DataFrame` construction from dict of :class:`Series` with mismatched index dtypes sometimes raising depending on the ordering of the passed dict (:issue:`44091`)
645709
-
646710

@@ -672,12 +736,14 @@ Conversion
672736
^^^^^^^^^^
673737
- Bug in :class:`UInt64Index` constructor when passing a list containing both positive integers small enough to cast to int64 and integers too large to hold in int64 (:issue:`42201`)
674738
- Bug in :class:`Series` constructor returning 0 for missing values with dtype ``int64`` and ``False`` for dtype ``bool`` (:issue:`43017`, :issue:`43018`)
739+
- Bug in constructing a :class:`DataFrame` from a :class:`PandasArray` containing :class:`Series` objects behaving differently than an equivalent ``np.ndarray`` (:issue:`43986`)
675740
- Bug in :class:`IntegerDtype` not allowing coercion from string dtype (:issue:`25472`)
676741
- Bug in :func:`to_datetime` with ``arg:xr.DataArray`` and ``unit="ns"`` specified raising ``TypeError`` (:issue:`44053`)
677742
- Bug in :meth:`DataFrame.convert_dtypes` not returning the correct type when a subclass does not overload :meth:`_constructor_sliced` (:issue:`43201`)
678743
- Bug in :meth:`DataFrame.astype` not propagating ``attrs`` from the original :class:`DataFrame` (:issue:`44414`)
679744
- Bug in :meth:`DataFrame.convert_dtypes` result losing ``columns.names`` (:issue:`41435`)
680745
- Bug in constructing an ``IntegerArray`` from pyarrow data failing to validate dtypes (:issue:`44891`)
746+
- Bug in :meth:`Series.astype` not allowing converting from a ``PeriodDtype`` to ``datetime64`` dtype, inconsistent with the :class:`PeriodIndex` behavior (:issue:`45038`)
681747
-
682748

683749
Strings
@@ -787,6 +853,7 @@ I/O
787853
- Bug in :func:`read_csv` not setting name of :class:`MultiIndex` columns correctly when ``index_col`` is not the first column (:issue:`38549`)
788854
- Bug in :func:`read_csv` silently ignoring errors when failing to create a memory-mapped file (:issue:`44766`)
789855
- Bug in :func:`read_csv` when passing a ``tempfile.SpooledTemporaryFile`` opened in binary mode (:issue:`44748`)
856+
- Bug in :func:`read_json` raising ``ValueError`` when attempting to parse json strings containing "://" (:issue:`36271`)
790857
-
791858

792859
Period
