
Dataframe Resample On Date Ranges

I have a DataFrame that has the columns 'start_time' (datetime), 'end_time' (datetime), 'mode' and some other columns. There is no overlap in the ranges of different rows of the ta

Solution 1:

You can use:

# convert columns to datetimes if necessary
df['start_time'] = pd.to_datetime(df['start_time'])
df['end_time'] = pd.to_datetime(df['end_time'])
# subtract 10s from the end_time column so the end of each range does not produce a row
df['end_time'] = df['end_time'] - pd.Timedelta(10, unit='s')

# build one date range per row in a list comprehension
# and concat them into one big DataFrame
df1 = (pd.concat([pd.Series(r.Index, pd.date_range(r.start_time, r.end_time, freq='10S'))
                  for r in df.itertuples()])
         .reset_index())
df1.columns = ['current_time', 'idx']
print(df1)
         current_time  idx
0 2017-06-01 06:38:00    0
1 2017-06-01 06:38:10    0
2 2017-06-01 06:38:20    0
3 2017-06-01 06:38:30    0
4 2017-06-01 06:38:40    0
5 2017-06-01 06:38:50    0
6 2017-06-01 17:22:00    1
7 2017-06-01 17:22:10    1
8 2017-06-01 17:22:20    1

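EDIT, following the OP's comment:

If you pass the parameter closed='left':

pd.date_range(r.start_time, r.end_time, freq='10S', closed='left')

then the subtraction can be omitted. In pandas 1.4 and later this parameter is spelled inclusive='left'; closed was removed from date_range in pandas 2.0.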
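To check that the half-open range really replaces the subtraction, here is a small sketch. The start and end timestamps are assumed sample values (not taken from the question), and it uses the inclusive='left' spelling, so it assumes pandas 1.4 or later.

```python
import pandas as pd

# Assumed sample boundaries for one row (hypothetical values).
start = pd.Timestamp("2017-06-01 06:38:00")
end = pd.Timestamp("2017-06-01 06:39:00")

# Option A: move the end back by one step, then build a closed range.
shifted = pd.date_range(start, end - pd.Timedelta(10, unit="s"), freq="10s")

# Option B: keep the end as-is and exclude it from the range.
half_open = pd.date_range(start, end, freq="10s", inclusive="left")

print(shifted.equals(half_open))  # True
print(len(half_open))             # 6
```

Both variants stop at 06:38:50, so the original end_time column can stay untouched with option B.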


# join all other columns by index
df2 = df1.set_index('idx').join(df.drop(['start_time','end_time'], axis=1)).reset_index(drop=True)
print(df2)
         current_time mode
0 2017-06-01 06:38:00    x
1 2017-06-01 06:38:10    x
2 2017-06-01 06:38:20    x
3 2017-06-01 06:38:30    x
4 2017-06-01 06:38:40    x
5 2017-06-01 06:38:50    x
6 2017-06-01 17:22:00    y
7 2017-06-01 17:22:10    y
8 2017-06-01 17:22:20    y
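Putting the steps of this solution together, a self-contained sketch could look like the following. The input frame is an assumption reconstructed from the printed output (the question's real data is not shown), and expand_ranges is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Assumed sample input, reconstructed from the printed output above.
df = pd.DataFrame({
    "start_time": ["2017-06-01 06:38:00", "2017-06-01 17:22:00"],
    "end_time": ["2017-06-01 06:39:00", "2017-06-01 17:22:30"],
    "mode": ["x", "y"],
})
df["start_time"] = pd.to_datetime(df["start_time"])
df["end_time"] = pd.to_datetime(df["end_time"])


def expand_ranges(frame, step="10s"):
    """Hypothetical helper: one row per `step` in [start_time, end_time)."""
    one_step = pd.Timedelta(step)
    # One Series per input row: index = timestamps, value = original row label.
    pieces = [
        pd.Series(idx, index=pd.date_range(start, end - one_step, freq=step))
        for idx, start, end in zip(frame.index, frame["start_time"], frame["end_time"])
    ]
    expanded = pd.concat(pieces).reset_index()
    expanded.columns = ["current_time", "idx"]
    # Bring back the remaining columns via the original row label.
    others = frame.drop(columns=["start_time", "end_time"])
    return expanded.set_index("idx").join(others).reset_index(drop=True)


result = expand_ranges(df)
print(result)
```

The helper leaves df unchanged, because the 10-second shift is applied only while building each range.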

Another solution:

# create a column from the index for the last step (index values have to be unique)
df = df.reset_index()
# reshape the dates into a DatetimeIndex
df1 = (df.melt(df.columns.difference(['start_time','end_time']),
              ['start_time', 'end_time'],
              value_name='current_time')
        .drop('variable', axis=1)
        .set_index('current_time'))
print (df1)
                     index mode
current_time
2017-06-01 06:38:00      0    x
2017-06-01 17:22:00      1    y
2017-06-01 06:38:50      0    x
2017-06-01 17:22:20      1    y

# group by the index column and resample; NaNs are replaced by forward filling
df2 = df1.groupby('index').resample('10S').ffill().reset_index(0, drop=True).drop('index', axis=1)
print (df2)
                    mode
current_time
2017-06-01 06:38:00    x
2017-06-01 06:38:10    x
2017-06-01 06:38:20    x
2017-06-01 06:38:30    x
2017-06-01 06:38:40    x
2017-06-01 06:38:50    x
2017-06-01 17:22:00    y
2017-06-01 17:22:10    y
2017-06-01 17:22:20    y
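To see why the grouping matters, here is a sketch of the same idea with an explicit loop over the groups instead of the chained groupby(...).resample(...) call. The sample frame is assumed (end_time is taken as the last timestamp to emit, matching the output above), and the names boundaries, pieces and filled are only for illustration.

```python
import pandas as pd

# Assumed sample input; end_time is the last timestamp to emit (inclusive).
df = pd.DataFrame({
    "start_time": pd.to_datetime(["2017-06-01 06:38:00", "2017-06-01 17:22:00"]),
    "end_time": pd.to_datetime(["2017-06-01 06:38:50", "2017-06-01 17:22:20"]),
    "mode": ["x", "y"],
}).reset_index()

# Long format: two rows per original row, one at each boundary timestamp.
boundaries = (
    df.melt(id_vars=["index", "mode"],
            value_vars=["start_time", "end_time"],
            value_name="current_time")
      .drop(columns="variable")
      .set_index("current_time")
)

# Upsample each original row on its own; ffill copies the row's values
# into the new 10-second slots between its two boundaries.
pieces = [group.resample("10s").ffill() for _, group in boundaries.groupby("index")]
filled = pd.concat(pieces).drop(columns="index")
print(filled)
```

Resampling the whole frame in one go would also fill the gap between 06:38:50 and 17:22:00, which is why each original row is resampled separately.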
