
DOC: Clarify fill_value behavior in arithmetic ops #19653

@HagaiHargil


When adding two DataFrames with df1.add(df2), one can pass the fill_value parameter to fill in any NaNs that come up. This parameter seems pretty broken:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.ones((3, 2)))
b = pd.DataFrame(np.ones((4, 3)))

print(a + b)  # all good:
#      0    1   2
# 0  2.0  2.0 NaN
# 1  2.0  2.0 NaN
# 2  2.0  2.0 NaN
# 3  NaN  NaN NaN

print(a.add(b))  # all good

print(a.add(b, fill_value=0))  # broken
#      0    1    2
# 0  2.0  2.0  1.0
# 1  2.0  2.0  1.0
# 2  2.0  2.0  1.0
# 3  1.0  1.0  1.0
```

As you can see, the cells that were NaN now hold 1.0. Changing to fill_value=1 fills everything with 2.0. However, changing some of the values inside b leads to more peculiar results, and I couldn't connect the dots and find a pattern.
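For what it's worth, the numbers above look consistent with fill_value being substituted into each operand before the addition, rather than into the result afterwards. Below is a minimal sketch of that reading, not the actual pandas implementation. The helper name add_with_fill is made up for illustration, and the rule that a cell missing in both operands stays NaN is an assumption taken from the documented description of fill_value.

```python
import numpy as np
import pandas as pd


def add_with_fill(left, right, fill_value):
    """Hypothetical helper: emulate left.add(right, fill_value=...)."""
    # Align both frames on the union of their row and column labels;
    # cells that exist in only one frame become NaN in the other.
    l, r = left.align(right, join="outer")
    # Substitute fill_value for the missing operand, then add.
    total = l.fillna(fill_value) + r.fillna(fill_value)
    # Assumption: a cell missing in BOTH operands stays NaN.
    return total.where(l.notna() | r.notna())


a = pd.DataFrame(np.ones((3, 2)))
b = pd.DataFrame(np.ones((4, 3)))

manual = add_with_fill(a, b, fill_value=0)
builtin = a.add(b, fill_value=0)
print(manual)
print(builtin)

# A cell that is NaN in both inputs is not filled.
c = pd.DataFrame({"x": [1.0, np.nan]})
d = pd.DataFrame({"x": [np.nan, np.nan]})
both_missing = c.add(d, fill_value=0)
print(both_missing)
```

Under this reading, the 1.0 entries are 0 + 1.0 (the cell is missing in a and present in b), and fill_value=1 gives 1 + 1.0 = 2.0 everywhere.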

This was observed on Python 3.6 on both Linux and Windows.

Thanks.

```
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.7
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
```
