BUG: Adding Series to empty DataFrame can reset dtype to float64 #42099

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas (1.3.0rc1).
(optional) I have confirmed this bug exists on the master branch of pandas

Code Sample, a copy-pastable example

    import pandas as pd

    data = pd.array([0, 1, 2, 3], dtype="Int32")
    df = expected = pd.DataFrame({"data": pd.Series(data)})
    result = pd.DataFrame(index=df.index)
    result.loc[df.index, "data"] = df["data"]

    print(df["data"].dtype)      # prints: Int32
    print(result["data"].dtype)  # prints: float64 <--

Problem description

This behavior is unexpected: the dtype of the assigned Series should be preserved rather than coerced to the default dtype of an empty Series. It occurs for the nullable integer dtypes as well as Float32/Float64.
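
To see which nullable dtypes are affected on a given pandas build, the minimal example can be wrapped in a loop. This is only a sketch: the helper names `dtype_after_loc_expansion` and `survey_dtypes` and the list `NULLABLE_DTYPES` are hypothetical, and the result column is deliberately not annotated with expected values because it depends on the installed pandas version.

```python
import pandas as pd

# Nullable dtypes named in the report: masked integers plus Float32/Float64.
# (This list is an assumption made for illustration, not taken from pandas.)
NULLABLE_DTYPES = [
    "Int8", "Int16", "Int32", "Int64",
    "UInt8", "UInt16", "UInt32", "UInt64",
    "Float32", "Float64",
]


def dtype_after_loc_expansion(dtype_name):
    """Return (source dtype, dtype after adding the column via .loc)."""
    data = pd.array([0, 1, 2, 3], dtype=dtype_name)
    df = pd.DataFrame({"data": pd.Series(data)})
    result = pd.DataFrame(index=df.index)
    result.loc[df.index, "data"] = df["data"]
    return df["data"].dtype, result["data"].dtype


def survey_dtypes(dtype_names=NULLABLE_DTYPES):
    """Build one row per dtype: source dtype, resulting dtype, and a match flag."""
    rows = []
    for name in dtype_names:
        source, actual = dtype_after_loc_expansion(name)
        rows.append(
            {
                "source": str(source),
                "result": str(actual),
                # Compare the string names to avoid mixing dtype classes.
                "preserved": str(source) == str(actual),
            }
        )
    return pd.DataFrame(rows)


if __name__ == "__main__":
    print("pandas", pd.__version__)
    print(survey_dtypes().to_string(index=False))
```

Any row whose `preserved` flag is False reproduces the coercion described here for that dtype.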

I came across this while implementing an ExtensionDtype that ended up failing BaseSetitemTests.test_setitem_with_expansion_dataframe_column:

pandas/pandas/tests/extension/base/setitem.py

Lines 335 to 343 in 648eb40

    def test_setitem_with_expansion_dataframe_column(self, data, full_indexer):
        # https://github.com/pandas-dev/pandas/issues/32395
        df = expected = pd.DataFrame({"data": pd.Series(data)})
        result = pd.DataFrame(index=df.index)

        key = full_indexer(df)
        result.loc[key, "data"] = df["data"]

        self.assert_frame_equal(result, expected)
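
For anyone who wants to run the same check outside the pandas test suite, here is a rough standalone sketch. The `FULL_INDEXERS` mapping is a hypothetical stand-in for the `full_indexer` fixture (the real fixture may use a different set of indexers), and `check_expansion` is a name made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the `full_indexer` fixture: each callable takes
# the source frame and returns a key that selects every row.
FULL_INDEXERS = {
    "index": lambda df: df.index,
    "list(index)": lambda df: list(df.index),
    "null_slice": lambda df: slice(None),
    "mask": lambda df: np.ones(len(df), dtype=bool),
}


def check_expansion(data):
    """Map each indexer name to True if the expanded frame equals the source."""
    outcomes = {}
    for name, make_key in FULL_INDEXERS.items():
        df = expected = pd.DataFrame({"data": pd.Series(data)})
        result = pd.DataFrame(index=df.index)
        result.loc[make_key(df), "data"] = df["data"]
        try:
            # assert_frame_equal checks dtypes by default, so a coerced
            # column is reported as a mismatch.
            pd.testing.assert_frame_equal(result, expected)
        except AssertionError:
            outcomes[name] = False
        else:
            outcomes[name] = True
    return outcomes


if __name__ == "__main__":
    for name, ok in check_expansion(pd.array([0, 1, 2, 3], dtype="Int32")).items():
        print(f"{name}: {'ok' if ok else 'dtype or values differ'}")
```

Passing a custom ExtensionArray as `data` should exercise the same code path as the base test, assuming the array can be wrapped in a Series.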
 

Interestingly, the test data for IntegerArray and FloatingArray includes missing values, and in that case the dtype is not coerced to float64:

    import pandas as pd

    data = pd.array([0, pd.NaT, 2, 3], dtype="Int32")
    df = expected = pd.DataFrame({"data": pd.Series(data)})
    result = pd.DataFrame(index=df.index)
    result.loc[df.index, "data"] = df["data"]

    print(df["data"].dtype)      # prints: Int32
    print(result["data"].dtype)  # prints: Int32 <--

My expectation was that the dtype would be preserved in both cases, with or without missing values.

Expected Output

I would expect the dtype of the pd.Series being added to result to be preserved. In the minimal example, result["data"] should have dtype Int32:

    print(df["data"].dtype)      # prints: Int32
    print(result["data"].dtype)  # prints: Int32 <--

Output of `pd.show_versions()`

This was generated from the latest release candidate, but the bug also appears to occur on the master branch (1.4.0.dev0+56.g648eb40abc).

INSTALLED VERSIONS

commit : 2dd9e9b
python : 3.8.5.final.0
python-bits : 64
OS : Darwin
OS-release : 17.7.0
Version : Darwin Kernel Version 17.7.0: Fri Oct 30 13:34:27 PDT 2020; root:xnu-4570.71.82.8~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0rc1
numpy : 1.20.3
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.2
setuptools : 49.6.0.post20210108
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : 3.3.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.24.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 2021.05.0
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : None
tables : None
tabulate : 0.8.7
xarray : None
xlrd : None
xlwt : None
numba : None
