
Python Pandas Dataframe Resample Daily Data To Week By Mon-Sun Weekly Definition?

import pandas as pd
import numpy as np
dates = pd.date_range('20141229', periods=14, name='Day')
df = pd.DataFrame({'Sum1': [1667, 1229, 1360, 9232, 8866, 4083, 3671, 10085, 10005,

Solution 1:

In case anyone else was not aware, it turns out that the weekly anchored offsets are based on the end date. So resampling with 'W' (which is the same as 'W-SUN') gives Monday-to-Sunday samples by default, and the date listed is the end date (the Sunday). See this old bug report wherein neither the documentation nor the API got updated.

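Given that you specified label='left' in the resample parameters, you must have realized that fact. It's also why using 'W-MON' does not have the desired effect. What is confusing is that the left bound is not actually in the interval: weekly bins are closed on the right by default, so with label='left' each week is labeled with the Sunday before it begins.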
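To see the labeling concretely, here is a minimal sketch on made-up data (a series of ones, not the question's data). It assumes a reasonably recent pandas and only shows which dates end up as labels with the default settings and with label='left'.

```python
import pandas as pd

# Fourteen days starting on Monday 2014-12-29, one unit per day
# (illustrative data, not taken from the question).
s = pd.Series(1, index=pd.date_range('2014-12-29', periods=14, name='Day'))

by_end = s.resample('W').sum()                  # default: labeled by the week's end
by_left = s.resample('W', label='left').sum()   # labeled by the bin's left edge

print(by_end.index.day_name().tolist())              # ['Sunday', 'Sunday']
print(by_left.index.strftime('%Y-%m-%d').tolist())   # ['2014-12-28', '2015-01-04']
```

Both calls group the same Monday-to-Sunday weeks; only the label changes, and the left label is the Sunday just outside the week it names.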

So, to display the start date for the period instead of the end date, you may add a day to the index. That would mean you would do:

df_resampled.index = df_resampled.index + pd.DateOffset(days=1)

For completeness, here is your original data with another day (a Sunday) added on the beginning to show the grouping really is Monday to Sunday:

import pandas as pd
import numpy as np

dates = pd.date_range('20141228', periods=15, name='Day')
df = pd.DataFrame({'Sum1': [10000, 1667, 1229, 1360, 9232, 8866, 4083, 3671, 10085, 10005, 8730, 10056, 10176, 3792, 3518],
                   'Sum2': [10000, 91, 75, 75, 254, 239, 108, 99, 259, 395, 355, 332, 386, 96, 111],
                   'Sum3': [10000, 365.95, 398.97, 285.12, 992.17, 1116.57, 512.11, 504.47, 1190.96, 1753.6, 1646.25, 1344.05, 1582.67, 560.95, 736.44],
                   'Sum4': [10000, 5, 5, 1, 5, 8, 8, 2, 10, 12, 16, 16, 6, 6, 3]},
                  index=dates)
print(df)

# how='sum' is the older resample API; see the update below for the current form
df_resampled = df.resample('W', how='sum', label='left')
# shift each label from the preceding Sunday to the Monday that starts the week
df_resampled.index = df_resampled.index + pd.DateOffset(days=1)
print(df_resampled)

This outputs:

             Sum1   Sum2      Sum3   Sum4
Day
2014-12-28  10000  10000  10000.00  10000
2014-12-29   1667     91    365.95      5
2014-12-30   1229     75    398.97      5
2014-12-31   1360     75    285.12      1
2015-01-01   9232    254    992.17      5
2015-01-02   8866    239   1116.57      8
2015-01-03   4083    108    512.11      8
2015-01-04   3671     99    504.47      2
2015-01-05  10085    259   1190.96     10
2015-01-06  10005    395   1753.60     12
2015-01-07   8730    355   1646.25     16
2015-01-08  10056    332   1344.05     16
2015-01-09  10176    386   1582.67      6
2015-01-10   3792     96    560.95      6
2015-01-11   3518    111    736.44      3

             Sum1   Sum2      Sum3   Sum4
Day
2014-12-22  10000  10000  10000.00  10000
2014-12-29  30108    941   4175.36     34
2015-01-05  56362   1934   8814.92     69

I believe that is what you wanted for Question 1.

Update

There is now a loffset argument to resample() that allows you to shift the label offset. So, instead of modifying the index, you simply add the loffset argument like so:

df.resample('W', how='sum', label='left', loffset=pd.DateOffset(days=1))

Also of note: how='sum' is now deprecated in favor of calling .sum() on the Resampler object that .resample() returns. So, the fully updated call would be:

df_resampled = df.resample('W', label='left', loffset=pd.DateOffset(days=1)).sum()

Update 1.1.0

The handy loffset argument is deprecated as of version 1.1.0. The documentation indicates the shifting should be done after the resample. In this particular case, I believe that means this is the correct code (untested):

from pandas.tseries.frequencies import to_offset
df_resampled = df.resample('W', label='left').sum()
df_resampled.index = df_resampled.index + to_offset(pd.DateOffset(days=1))
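As a cross-check of the post-resample shift, here is a minimal sketch that reruns the example on two of the columns from the data above (Sum1 and Sum4), assuming a pandas version without loffset. The second call, using 'W-MON' with closed='left' and label='left', is an alternative that is not part of the original answer; it is shown only because it should yield the same Monday labels without touching the index afterwards.

```python
import pandas as pd

dates = pd.date_range('20141228', periods=15, name='Day')
df = pd.DataFrame({
    'Sum1': [10000, 1667, 1229, 1360, 9232, 8866, 4083, 3671,
             10085, 10005, 8730, 10056, 10176, 3792, 3518],
    'Sum4': [10000, 5, 5, 1, 5, 8, 8, 2, 10, 12, 16, 16, 6, 6, 3],
}, index=dates)

# Approach from the answer: resample, then move each label forward one day
# (from the Sunday before the week to the Monday that starts it).
shifted = df.resample('W', label='left').sum()
shifted.index = shifted.index + pd.DateOffset(days=1)

# Alternative (an assumption, not from the answer): bins anchored on Monday,
# closed and labeled on the left, i.e. [Monday, next Monday).
alt = df.resample('W-MON', closed='left', label='left').sum()

print(shifted)
print(alt)
```

Both frames should carry the labels 2014-12-22, 2014-12-29 and 2015-01-05, matching the output shown earlier.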

Solution 2:

This might help.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1, 1000, (100, 4)),
                  columns='Sum1 Sum2 Sum3 Sum4'.split(),
                  index=pd.date_range('2014-12-29', periods=100, freq='D'))

def func(group):
    return pd.Series({'Sum1': group.Sum1.sum(),
                      'Sum2': group.Sum2.sum(),
                      'Sum3': group.Sum3.sum(),
                      'Sum4': group.Sum4.sum(),
                      # index[1] is the second date in each group, as the output below shows
                      'Day': group.index[1],
                      'Period': '{0} - {1}'.format(group.index[0].date(), group.index[-1].date())})

# group rows by the week number of each timestamp in the index
df.groupby(lambda idx: idx.week).apply(func)

Out[386]:
           Day                   Period  Sum1  Sum2  Sum3  Sum4
1   2014-12-30  2014-12-29 - 2015-01-04  3559  3692  3648  4086
2   2015-01-06  2015-01-05 - 2015-01-11  2990  3658  3348  3304
3   2015-01-13  2015-01-12 - 2015-01-18  3168  3720  3518  3273
4   2015-01-20  2015-01-19 - 2015-01-25  2275  4968  4095  2366
5   2015-01-27  2015-01-26 - 2015-02-01  4146  2167  3888  4576
..         ...                      ...   ...   ...   ...   ...
11  2015-03-10  2015-03-09 - 2015-03-15  4035  3518  2588  2714
12  2015-03-17  2015-03-16 - 2015-03-22  3399  3901  3430  2143
13  2015-03-24  2015-03-23 - 2015-03-29  3227  3308  3185  3814
14  2015-03-31  2015-03-30 - 2015-04-05  4278  3369  3623  4167
15  2015-04-07  2015-04-06 - 2015-04-07  1466   632  1136  1392

[15 rows x 6 columns]
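A related way to get a start date and a period label per week is to group on weekly periods instead of week numbers. This is only a sketch and not the answerer's method: it uses the Sum1 values from Solution 1's data (without the extra Sunday), and the column names Start and Period are chosen here for illustration. It relies on to_period('W') producing Monday-to-Sunday periods.

```python
import pandas as pd

dates = pd.date_range('2014-12-29', periods=14, name='Day')
df = pd.DataFrame({'Sum1': [1667, 1229, 1360, 9232, 8866, 4083, 3671,
                            10085, 10005, 8730, 10056, 10176, 3792, 3518]},
                  index=dates)

# Each timestamp maps to its weekly period (Monday through Sunday).
weekly = df.groupby(df.index.to_period('W')).sum()

# Illustrative extra columns: the Monday that starts each week and a text label.
weekly['Start'] = weekly.index.start_time
weekly['Period'] = ['{0} - {1}'.format(p.start_time.date(), p.end_time.date())
                    for p in weekly.index]
print(weekly)
```

Unlike grouping on the week number, periods include the year, so weeks from different years are not merged when the data spans more than twelve months.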
