
Extract Business Days In Time Series Using Python/pandas

I am working with high-frequency time series data and I would like to extract all the business days from it. My observations are separated by seconds, so there are 86400 s

Solution 1:

Unfortunately this is a little slow, but should at least give the answer you are looking for.

import pandas as pd

# Create an index of just the date portion of your index (this is the slow step).
ts_days = pd.to_datetime(ts.index.date)

# Create a range of business days over that period.
bdays = pd.bdate_range(start=ts.index[0].date(), end=ts.index[-1].date())

# Filter the series to just those days contained in the business day range.
ts = ts[ts_days.isin(bdays)]
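As a quick check of the three steps above, here is a minimal sketch on made-up data. The sample index (four days starting on Friday 2021-01-01, one observation every six hours) and the name `filtered` are assumptions chosen for illustration only; they are not part of the original question.

```python
import pandas as pd

# Hypothetical sample: 16 observations, six hours apart, covering
# Friday 2021-01-01 through Monday 2021-01-04.
idx = pd.date_range("2021-01-01", periods=16, freq=pd.Timedelta(hours=6))
ts = pd.Series(range(len(idx)), index=idx)

# Step 1: reduce every timestamp to its calendar date (the slow step).
ts_days = pd.to_datetime(ts.index.date)

# Step 2: build the business days (Monday to Friday) spanning the data.
bdays = pd.bdate_range(start=ts.index[0].date(), end=ts.index[-1].date())

# Step 3: keep only the observations whose date is a business day.
filtered = ts[ts_days.isin(bdays)]

# Only the Friday and Monday observations should remain.
print(filtered.index.day_name().unique())
```

Note that `bdate_range` with its default settings only excludes Saturdays and Sundays; public holidays are still treated as business days.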

Solution 2:

pandas stores timestamps as numpy.datetime64 values, by default with a nanosecond time unit (you can check this by inspecting ts.index.values). It is much faster to convert both the original index and the one generated by bdate_range to a daily time unit ([D]) and to check inclusion on these two arrays:

import numpy as np
import pandas

def _get_days_array(index):
    "Convert the index to a datetime64[D] array"
    return index.values.astype('<M8[D]')

def retain_business_days(ts):
    "Retain only the business days"
    tsdays = _get_days_array(ts.index)
    bdays = _get_days_array(pandas.bdate_range(tsdays[0], tsdays[-1]))
    # np.isin replaces np.in1d, which is deprecated as of NumPy 2.0
    mask = np.isin(tsdays, bdays)
    return ts[mask]
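For reference, here is a self-contained sketch of how this approach might be tried on second-resolution data like that described in the question. The sample index (four days starting on Friday 2021-01-01) and the name `weekdays_only` are assumptions for illustration, and the function is a condensed restatement of the one above that uses np.isin.

```python
import numpy as np
import pandas

def retain_business_days(ts):
    """Keep only the observations that fall on a business day."""
    # Truncate every timestamp to day resolution (datetime64[D]).
    ts_days = ts.index.values.astype("datetime64[D]")
    # Business days spanning the data, at the same day resolution.
    bdays = pandas.bdate_range(ts_days[0], ts_days[-1]).values.astype("datetime64[D]")
    # Boolean mask: True where the observation's day is a business day.
    return ts[np.isin(ts_days, bdays)]

# Hypothetical sample: one observation per second for four days,
# Friday 2021-01-01 through Monday 2021-01-04 (86400 per day).
idx = pandas.date_range("2021-01-01", periods=4 * 86400, freq=pandas.Timedelta(seconds=1))
ts = pandas.Series(np.arange(len(idx)), index=idx)

# Inspect how the timestamps are stored before converting them.
print(ts.index.values.dtype)

weekdays_only = retain_business_days(ts)
print(len(ts), len(weekdays_only))
```

The mask is computed on plain NumPy arrays of days, which avoids building a Python date object for every observation as Solution 1 does.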
