Pandas P&l Rollup To The Next Business Day
I'm having a hard time trying to do this efficiently. I have some stocks and daily P&L info in a dataframe. In reality, I have millions of rows of data so efficiency matters a
Solution 1:
We can create the DataFrame of business dates then merge_asof
. Then we can group on this to get the sums.
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
#df['Date'] = pd.to_datetime(df.Date)
date_min = '2015-01-01'
date_max = '2016-12-31'
cal = USFederalHolidayCalendar()
holidays = cal.holidays(date_min, date_max).tolist()
df2 = pd.DataFrame({'bdate': pd.bdate_range(date_min, date_max,
holidays=holidays, freq='C')})
res = pd.merge_asof(df, df2, left_on='Date', right_on='bdate', direction='forward')
# Date Security P&L bdate#0 2016-01-01 AAPL 100 2016-01-04#1 2016-01-02 AAPL 200 2016-01-04#2 2016-01-03 AAPL 300 2016-01-04#3 2016-01-04 AAPL -200 2016-01-04
res.groupby(['Security', 'bdate'])['P&L'].sum()
#Security bdate #AAPL 2016-01-04 400
Solution 2:
IIUC you can do something like:
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
import numpy as np
date_min = '2015-01-01'
date_max = '2016-12-31'
cal = USFederalHolidayCalendar()
holidays = cal.holidays(date_min, date_max).tolist()
df = pd.DataFrame({"Date":pd.date_range(date_min, date_max)})
df["Security"] ="APPL"
df["P&L"] = np.random.randint(-1000, 1000, len(df))
df[~df["Date"].isin(holidays)].groupby("Security")\
.agg({"Date":"max",
"P&L":"sum"})\
.reset_index()
Post a Comment for "Pandas P&l Rollup To The Next Business Day"