Closed
Description
Code Sample, a copy-pastable example if possible
    import pandas as pd

    df = pd.DataFrame({'end_time': [pd.to_datetime('now', utc=True).tz_convert('Asia/Singapore')],
                       'id': [1]})
    df['max_end_time'] = df.groupby('id').end_time.transform(max)

`df.info()` shows

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 1 entries, 0 to 0
    Data columns (total 3 columns):
    end_time        1 non-null datetime64[ns, Asia/Singapore]
    id              1 non-null int64
    max_end_time    1 non-null datetime64[ns]
    dtypes: datetime64[ns, Asia/Singapore](1), datetime64[ns](1), int64(1)
    memory usage: 104.0 bytes

`df.to_dict()` shows

    {'end_time': {0: Timestamp('2018-12-10 17:08:52.630644+0800', tz='Asia/Singapore')},
     'id': {0: 1},
     'max_end_time': {0: Timestamp('2018-12-10 09:08:52.630644')}}

Problem description
After a groupby-transform operation on a tz-aware datetime column, the timezone is silently dropped and the timestamps are converted to naive UTC values.
Expected Output
    assert (df['end_time'] == df['max_end_time']).all()
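Until the transform preserves the dtype, one possible workaround is to re-attach the timezone by hand. The following is only a minimal sketch, not a fix from the pandas maintainers: `restore_tz` is a hypothetical helper name, the timestamp is fixed rather than `'now'` so the example is reproducible, and it assumes the naive values are UTC wall times, as in the output reported above. It uses the string alias `'max'` instead of the builtin `max`.

```python
import pandas as pd


def restore_tz(result: pd.Series, reference: pd.Series) -> pd.Series:
    """Hypothetical helper: give `result` the timezone of `reference` if it was lost."""
    tz = reference.dt.tz
    if tz is not None and result.dt.tz is None:
        # Assumption: the naive values are UTC wall times, as in the report above.
        return result.dt.tz_localize("UTC").dt.tz_convert(tz)
    # Timezone already present (or reference is naive): nothing to repair.
    return result


# Fixed timestamp instead of 'now' so the example is reproducible.
end_time = pd.Timestamp("2018-12-10 09:08:52.630644", tz="UTC").tz_convert("Asia/Singapore")
df = pd.DataFrame({"end_time": [end_time], "id": [1]})

transformed = df.groupby("id")["end_time"].transform("max")
df["max_end_time"] = restore_tz(transformed, df["end_time"])
```

On a pandas version that already keeps the timezone, the helper returns the transformed column unchanged, so the same code can run before and after upgrading.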
Output of pd.show_versions()
<details>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.85-38.58.amzn1.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: 3.10.0
pip: 18.1
setuptools: 40.5.0
Cython: 0.29
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.1.1
sphinx: None
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.3
html5lib: None
sqlalchemy: 1.2.13
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
```

</details>