Pandas | Time Series Manipulation

Pandas is a Python library with extensive tools for manipulating time series data. Key time series features include:

  1. Date Ranges:

    • Generating date ranges:
      import pandas as pd
      date_range = pd.date_range(start='2022-01-01', end='2022-01-10')
  2. Shifting Data:

    • Shifting data points forward or backward:
      ts = pd.Series(range(10), index=date_range)
      shifted_forward = ts.shift(1)    # shift data forward by 1 period
      shifted_backward = ts.shift(-1)  # shift data backward by 1 period
  3. Resampling:

    • Changing the frequency of your time series data:
      ts.resample('2D').sum() # Resample data into 2-day bins and sum values 
  4. Rolling Windows:

    • Apply operations over fixed-size windows:
      rolling_mean = ts.rolling(window=3).mean() # Calculate rolling mean over 3 periods 
  5. Time Zones:

    • Handle time zones:
      ts_utc = ts.tz_localize('UTC')            # Localize time series to UTC
      ts_est = ts_utc.tz_convert('US/Eastern')  # Convert to Eastern Time
  6. Time Offsets:

    • Manipulate dates using offsets:
      from pandas.tseries.offsets import BDay
      next_business_day = pd.Timestamp('2022-01-01') + BDay(1)  # Add one business day
  7. Lagging and Leading:

    • Retrieve lagging or leading values:
      ts_lagged = ts.shift(1)   # Lag values by one period
      ts_leaded = ts.shift(-1)  # Lead values by one period
  8. Time Deltas:

    • Differences between times:
      delta = pd.Timestamp('2022-01-10') - pd.Timestamp('2022-01-01') 
  9. Date Functionality:

    • Extracting date properties:
      day = ts.index.day
      month = ts.index.month
      year = ts.index.year
  10. Interpolating Missing Dates:

    • Filling missing dates:
      import numpy as np
      idx = pd.date_range('2022-01-01', '2022-01-10')
      ts = pd.Series([1, 2, np.nan, np.nan, 5], index=pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-05', '2022-01-06', '2022-01-10']))
      ts_reindexed = ts.reindex(idx)                # Add the missing dates as NaN rows
      ts_interpolated = ts_reindexed.interpolate()  # Fill NaN values by linear interpolation
  11. Slicing Time Series Data:

    • Easily slice data within a range of dates:
      subset = ts['2022-01-01':'2022-01-05'] 
  12. Frequency Conversion:

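    • Convert frequencies, like converting daily data to monthly. asfreq keeps only the values at the new timestamps (here, month ends) and does not aggregate; use resample to aggregate:
      ts.asfreq('M')  # Month-end frequency; the alias is 'ME' in pandas 2.2 and later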
  13. Plotting:

    • Pandas integrates with Matplotlib for easy time series plotting:
      import matplotlib.pyplot as plt
      ts.plot()
      plt.show()
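
The shifting, resampling, and rolling-window items above can be checked together on one small series. The following is a minimal sketch, assuming the same ten daily values 0 to 9 used in the earlier examples; the variable names two_day_sums and shifted are chosen here for illustration only.

```python
import pandas as pd

# Ten daily observations with values 0..9, as in the examples above.
date_range = pd.date_range(start="2022-01-01", end="2022-01-10")
ts = pd.Series(range(10), index=date_range)

# shift(1) moves each value one period later, so the first slot becomes NaN.
shifted = ts.shift(1)

# resample('2D') groups consecutive pairs of days and sums each pair.
two_day_sums = ts.resample("2D").sum()

# rolling(window=3) needs three observations, so the first two results are NaN.
rolling_mean = ts.rolling(window=3).mean()

print(shifted.head(3))
print(two_day_sums.tolist())  # [1, 5, 9, 13, 17]
print(rolling_mean.head(4))
```

Both shift and rolling introduce NaN values at the edge of the series, which is worth keeping in mind before passing the result to code that does not handle missing data.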

This is a basic introduction to time series manipulation in Pandas. The Pandas documentation covers many more methods for handling time series data effectively.
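
One detail that is easy to miss is the difference between selecting values at a new frequency and aggregating to it. The sketch below is illustrative only: the series daily is hypothetical data (90 daily observations, all equal to 1), and the frequency is passed as a pd.offsets.MonthEnd() object rather than a string alias so that it does not depend on which alias a given pandas version expects.

```python
import pandas as pd

# Hypothetical data: 90 daily observations from 2022-01-01 to 2022-03-31, all equal to 1.
daily = pd.Series(1, index=pd.date_range("2022-01-01", periods=90, freq="D"))

month_end = pd.offsets.MonthEnd()

# asfreq selects the single observation that falls on each month-end date.
selected = daily.asfreq(month_end)

# resample groups every observation in a month, here summing them.
monthly_totals = daily.resample(month_end).sum()

print(selected.tolist())        # [1, 1, 1]
print(monthly_totals.tolist())  # [31, 28, 31]
```

With this data, asfreq returns one value per month while resample returns the number of days in each month, which shows that only resample aggregates.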

