Skip to content

Conversation

@jbrockmendel
Copy link
Member

We're very nearly done with tslibs.timezones. I had hoped to bring over the remaining functions in smaller chunks, but this turns out to be the smallest independent subset that contains _get_dst_info and maybe_get_tz.

Getting _get_dst_info and maybe_get_tz separated is a milestone because it allows us to move tseries.frequencies.Resolution and related functions into cython without having a dependency on tslib. This in turn gets the dependency on khash out of tslib. A few more nice things become feasible.

This is almost pure cut/paste. The only change I made was replacing isinstance(tz, string_types) with is_string_object(tz). If requested, I'll do a follow-up to de-privatize the names.

timezones is within spitting distance of being valid python. Getting it over that hump would allow linting and coverage that is tough as it is.

Note that if/when #17363 is merged, a bunch of residual imports in tslib can be cleaned up.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine. need to get passing of course. also run some asv for timezones to confirm no changes.



cpdef int64_t period_ordinal_to_dt64(int64_t ordinal, int freq) nogil:
cpdef period_ordinal_to_dt64(int64_t ordinal, int freq) nogil:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a mistake-- and apparently the cause of the build errors. Just pushed a fix.

@jreback jreback self-assigned this Sep 14, 2017
@jreback jreback added the Internals Related to non-user accessible pandas implementation label Sep 14, 2017
@jbrockmendel
Copy link
Member Author

asv continuous -f 1.1 -E virtualenv master HEAD -b tz -b timezone [...] before after ratio [fa557f73] [30c3fa6b] - 1.58ms 1.07ms 0.68 timeseries.DatetimeIndex.time_reset_index_tz SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY. 
@jreback
Copy link
Contributor

jreback commented Sep 14, 2017

can you run benches for -b timeseries

@jbrockmendel
Copy link
Member Author

asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries before after ratio [fa557f73] [109d605e] + 234±3μs 527±70μs 2.25 timeseries.SemiMonthOffset.time_begin_decr + 789±10μs 1.33±0.08ms 1.69 timeseries.Offsets.time_custom_bmonthend_incr_n + 241±3μs 405±30μs 1.68 timeseries.SemiMonthOffset.time_begin_decr_n + 348μs 509μs 1.46 timeseries.DatetimeIndex.time_unique + 6.80±0.2ms 9.74±0.5ms 1.43 timeseries.ResampleDataFrame.time_mean_string + 6.99±0.07ms 9.94±0.8ms 1.42 timeseries.ResampleDataFrame.time_min_numpy + 13.0ms 18.3±0.4ms 1.41 timeseries.ToDatetime.time_iso8601_nosep + 2.13s 2.87s 1.35 timeseries.SemiMonthOffset.time_begin_apply_index + 25.0±0.7ms 33.2±0.9ms 1.33 timeseries.TimeSeries.time_add_irregular + 7.24±0.2ms 9.60±0.5ms 1.33 timeseries.ResampleDataFrame.time_min_string + 1.75s 2.22s 1.27 timeseries.SemiMonthOffset.time_end_incr_rng + 6.70±0.04ms 8.44±0.2ms 1.26 timeseries.TimeSeries.time_large_lookup_value + 250±5μs 313±7μs 1.25 timeseries.TimeSeries.time_sort_index_monotonic + 15.2±0.1ms 19.0ms 1.25 timeseries.ToDatetime.time_format_YYYYMMDD + 1.78s 2.19s 1.23 timeseries.SemiMonthOffset.time_end_apply_index + 37.3±0.9ms 45.7±2ms 1.23 timeseries.Iteration.time_iter_datetimeindex_preexit + 149±4μs 182±3μs 1.22 timeseries.Offsets.time_timeseries_day_incr + 7.61±0.2ms 9.21±2ms 1.21 timeseries.ResampleDataFrame.time_mean_numpy + 143±4μs 173±4μs 1.21 timeseries.Offsets.time_timeseries_day_apply + 17.7±0.5ms 21.0±0.1ms 1.19 timeseries.TimeDatetimeConverter.time_convert + 84.7±0.6μs 100±4μs 1.18 timeseries.Offsets.time_custom_bday_apply + 892ms 1.05s 1.18 timeseries.ToDatetime.time_iso8601_tz_spaceformat + 14.9±0.1ms 17.5±1ms 1.17 timeseries.ResampleSeries.time_timestamp_downsample_mean + 2.01s 2.32s 1.15 timeseries.SemiMonthOffset.time_begin_decr_rng + 234±0.9μs 266±10μs 1.13 timeseries.SemiMonthOffset.time_end_decr + 96.6±3μs 109±3μs 1.13 timeseries.Offsets.time_custom_bday_apply_dt64 + 48.0±1μs 54.0±2μs 1.12 timeseries.Offsets.time_timeseries_year_apply + 3.72s 4.17s 1.12 timeseries.SeriesArithmetic.time_add_offset_slow + 19.9±0.5ms 22.0±0.9ms 1.11 timeseries.TimeSeries.time_sort_index_non_monotonic + 1.64s 1.81s 1.10 timeseries.SeriesArithmetic.time_add_offset_fast + 1.61ms 1.78ms 1.10 timeseries.DatetimeIndex.time_reset_index_tz - 2.20s 1.98s 0.90 timeseries.SemiMonthOffset.time_begin_incr_rng - 9.25±0.2ms 8.11±0.1ms 0.88 timeseries.ResampleDataFrame.time_max_string - 56.4ms 49.4ms 0.88 timeseries.DatetimeIndex.time_infer_freq_none - 15.5ms 13.6±0.4ms 0.87 timeseries.ToDatetime.time_iso8601_format - 9.38ms 8.14ms 0.87 timeseries.DatetimeIndex.time_normalize - 8.87ms 7.55ms 0.85 timeseries.DatetimeIndex.time_infer_dst - 34.1±0.9ms 28.1±0.3ms 0.82 timeseries.ResampleSeries.time_period_downsample_mean - 305±10μs 244±1μs 0.80 timeseries.TimeSeries.time_timeseries_slice_minutely - 18.2ms 13.7ms 0.76 timeseries.DatetimeIndex.time_dti_factorize - 22.9ms 16.4ms 0.72 timeseries.DatetimeIndex.time_dti_tz_factorize - 1.82ms 1.30ms 0.71 timeseries.DatetimeIndex.time_reset_index - 5.19ms 2.97ms 0.57 timeseries.DatetimeIndex.time_add_offset_delta - 23.3ms 12.5ms 0.54 timeseries.DatetimeIndex.time_add_offset_fast - 38.1±1ms 18.4±0.7ms 0.48 timeseries.AsOfDataFrame.time_asof_nan_single - 6.16ms 2.96ms 0.48 timeseries.DatetimeIndex.time_add_timedelta - 19.1±0.2ms 8.98±0.3ms 0.47 timeseries.DatetimeAccessor.time_dt_accessor_normalize - 2.54s 1.19s 0.47 timeseries.DatetimeIndex.time_add_offset_slow - 23.1±0.8ms 10.8±0.2ms 0.47 timeseries.AsOf.time_asof_nan - 36.4±2ms 16.9±0.4ms 0.46 timeseries.AsOfDataFrame.time_asof_single - 12.5±0.8ms 5.72±0.1ms 0.46 timeseries.AsOf.time_asof_nan_single - 139±4μs 63.2±2μs 0.46 timeseries.AsOf.time_asof_single - 25.2±0.9ms 11.0±0.3ms 0.44 timeseries.AsOf.time_asof - 143±10ms 61.7±10ms 0.43 timeseries.AsOfDataFrame.time_asof_nan - 128±20ms 48.7±10ms 0.38 timeseries.AsOfDataFrame.time_asof - 149±10μs 50.9±1μs 0.34 timeseries.AsOf.time_asof_single_early - 547±50μs 166±7μs 0.30 timeseries.AsOfDataFrame.time_asof_single_early - 153±4μs 44.4±1μs 0.29 timeseries.DatetimeAccessor.time_dt_accessor - 14.0±0.5ms 2.62±0.09ms 0.19 frame_methods.frame_assign_timeseries_index.time_frame_assign_timeseries_index SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY. 
@jreback
Copy link
Contributor

jreback commented Sep 14, 2017

those are wildly inconsitent. do are using affinity?

@jbrockmendel
Copy link
Member Author

Did the previous asv run on the laptop. New one on the desktop:

asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries [...] [fa557f73] [109d605e] + 184±1ms 204±5ms 1.11 timeseries.SeriesArithmetic.time_add_offset_delta - 202±10ms 182±0.8ms 0.90 timeseries.SeriesArithmetic.time_add_offset_fast - 653±5ms 586±1ms 0.90 timeseries.SeriesArithmetic.time_add_offset_slow - 9.57±0.04μs 7.68±0.04μs 0.80 timeseries.Offsets.time_timeseries_year_apply 
@jreback
Copy link
Contributor

jreback commented Sep 15, 2017

ok that looks good. something failing.ping when green.

@codecov
Copy link

codecov bot commented Sep 15, 2017

Codecov Report

Merging #17526 into master will increase coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@ Coverage Diff @@ ## master #17526 +/- ## ========================================== + Coverage 91.18% 91.2% +0.02%  ========================================== Files 163 163 Lines 49545 49606 +61 ========================================== + Hits 45179 45245 +66  + Misses 4366 4361 -5
Flag Coverage Δ
#multiple 88.99% <ø> (+0.04%) ⬆️
#single 40.19% <ø> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.23% <0%> (-1.82%) ⬇️
pandas/core/frame.py 97.77% <0%> (-0.1%) ⬇️
pandas/core/dtypes/concat.py 98.26% <0%> (-0.03%) ⬇️
pandas/core/indexes/datetimes.py 95.53% <0%> (ø) ⬆️
pandas/core/indexes/category.py 98.55% <0%> (ø) ⬆️
pandas/tseries/offsets.py 97.18% <0%> (+0.03%) ⬆️
pandas/core/indexes/period.py 92.77% <0%> (+0.03%) ⬆️
pandas/core/categorical.py 95.57% <0%> (+0.04%) ⬆️
pandas/core/common.py 91.98% <0%> (+0.32%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa557f7...0147c24. Read the comment docs.

@codecov
Copy link

codecov bot commented Sep 15, 2017

Codecov Report

Merging #17526 into master will increase coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@ Coverage Diff @@ ## master #17526 +/- ## ========================================== + Coverage 91.18% 91.2% +0.02%  ========================================== Files 163 163 Lines 49545 49606 +61 ========================================== + Hits 45179 45245 +66  + Misses 4366 4361 -5
Flag Coverage Δ
#multiple 88.99% <ø> (+0.04%) ⬆️
#single 40.19% <ø> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.23% <0%> (-1.82%) ⬇️
pandas/core/frame.py 97.77% <0%> (-0.1%) ⬇️
pandas/core/dtypes/concat.py 98.26% <0%> (-0.03%) ⬇️
pandas/core/indexes/datetimes.py 95.53% <0%> (ø) ⬆️
pandas/core/indexes/category.py 98.55% <0%> (ø) ⬆️
pandas/tseries/offsets.py 97.18% <0%> (+0.03%) ⬆️
pandas/core/indexes/period.py 92.77% <0%> (+0.03%) ⬆️
pandas/core/categorical.py 95.57% <0%> (+0.04%) ⬆️
pandas/core/common.py 91.98% <0%> (+0.32%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa557f7...0147c24. Read the comment docs.

@jbrockmendel
Copy link
Member Author

ping.

@jreback jreback added Clean Timezones Timezone data dtype labels Sep 15, 2017
@jreback jreback added this to the 0.21.0 milestone Sep 15, 2017
@jreback jreback merged commit 328c7e1 into pandas-dev:master Sep 15, 2017
@jreback
Copy link
Contributor

jreback commented Sep 15, 2017

thanks; would take a followup for _ renaming as well.

@jbrockmendel jbrockmendel deleted the tslibs-timezones5 branch September 15, 2017 15:30
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Clean Internals Related to non-user accessible pandas implementation Timezones Timezone data dtype

2 participants