Dataframe Resample On Date Ranges
I have a DataFrame that has the columns 'start_time' (datetime), 'end_time' (datetime), 'mode' and some other columns. There is no overlap in the ranges of different rows of the ta
Solution 1:
You can use:
import pandas as pd

# convert columns to datetimes if necessary
df['start_time'] = pd.to_datetime(df['start_time'])
df['end_time'] = pd.to_datetime(df['end_time'])

# subtract 10s from end_time so the end point of each range is not generated
df['end_time'] = df['end_time'] - pd.Timedelta(10, unit='s')

# build one date range per row in a list comprehension,
# then concat them into one big DataFrame
df1 = (pd.concat([pd.Series(r.Index, pd.date_range(r.start_time, r.end_time, freq='10s'))
                  for r in df.itertuples()])
         .reset_index())
df1.columns = ['current_time', 'idx']
print(df1)
         current_time  idx
0 2017-06-01 06:38:00    0
1 2017-06-01 06:38:10    0
2 2017-06-01 06:38:20    0
3 2017-06-01 06:38:30    0
4 2017-06-01 06:38:40    0
5 2017-06-01 06:38:50    0
6 2017-06-01 17:22:00    1
7 2017-06-01 17:22:10    1
8 2017-06-01 17:22:20    1
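EDIT, following a comment from the OP:
If you pass the parameter closed='left' to date_range:
pd.date_range(r.start_time, r.end_time, freq='10s', closed='left')
then the subtraction step can be omitted. Note that in pandas 1.4 and later this parameter is named inclusive (inclusive='left'), and closed was removed from date_range in pandas 2.0.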
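Because the keyword differs between pandas versions, a small wrapper can hide the difference. The following is only a sketch: the helper name left_closed_range is made up for illustration, and it assumes that a pandas version which does not know the inclusive keyword rejects it with a TypeError.

```python
import pandas as pd


def left_closed_range(start, end, freq):
    """Timestamps from start up to, but not including, end.

    Hypothetical helper: tries the newer `inclusive` keyword first and
    falls back to the older `closed` keyword if it is not accepted.
    """
    try:
        return pd.date_range(start, end, freq=freq, inclusive="left")
    except TypeError:
        # Assumption: older pandas raises TypeError for the unknown keyword.
        return pd.date_range(start, end, freq=freq, closed="left")


ticks = left_closed_range(
    "2017-06-01 06:38:00", "2017-06-01 06:39:00", pd.Timedelta(seconds=10)
)
print(ticks)
```

With this helper the end_time column can stay untouched, since the end point is dropped when the range is generated.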
# join all the other columns by index
df2 = (df1.set_index('idx')
          .join(df.drop(['start_time', 'end_time'], axis=1))
          .reset_index(drop=True))
print(df2)
         current_time mode
0 2017-06-01 06:38:00    x
1 2017-06-01 06:38:10    x
2 2017-06-01 06:38:20    x
3 2017-06-01 06:38:30    x
4 2017-06-01 06:38:40    x
5 2017-06-01 06:38:50    x
6 2017-06-01 17:22:00    y
7 2017-06-01 17:22:10    y
8 2017-06-01 17:22:20    y
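For reference, the whole first approach can be put together as one self-contained sketch. The sample data below is invented to resemble the frame in the question, and expand_ranges is a hypothetical helper name. Instead of subtracting 10 seconds or using closed/inclusive, this version drops the end point with a plain comparison, which is a swapped-in technique chosen so the code does not depend on the pandas version.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame described in the question.
df = pd.DataFrame({
    "start_time": pd.to_datetime(["2017-06-01 06:38:00", "2017-06-01 17:22:00"]),
    "end_time": pd.to_datetime(["2017-06-01 06:39:00", "2017-06-01 17:22:30"]),
    "mode": ["x", "y"],
})


def expand_ranges(frame, step):
    """Return one row per `step` tick in [start_time, end_time) for each input row."""
    pieces = []
    for r in frame.itertuples():
        ticks = pd.date_range(r.start_time, r.end_time, freq=step)
        # Keep only ticks strictly before end_time (half-open interval).
        ticks = ticks[ticks < r.end_time]
        pieces.append(pd.DataFrame({"current_time": ticks, "idx": r.Index}))
    out = pd.concat(pieces, ignore_index=True)
    # Attach the remaining columns by matching idx against the original index.
    others = frame.drop(columns=["start_time", "end_time"])
    return out.join(others, on="idx").drop(columns="idx")


result = expand_ranges(df, pd.Timedelta(seconds=10))
print(result)
```

Like the original, this relies on the index of df being unique, because the join matches each generated row back to its source row through idx.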
Another solution:
# create a column from the index for the later groupby (index values have to be unique)
df = df.reset_index()
# reshape the two date columns into a single DatetimeIndex
df1 = (df.melt(id_vars=df.columns.difference(['start_time', 'end_time']),
               value_vars=['start_time', 'end_time'],
               value_name='current_time')
         .drop('variable', axis=1)
         .set_index('current_time'))
print(df1)
                     index mode
current_time
2017-06-01 06:38:00      0    x
2017-06-01 17:22:00      1    y
2017-06-01 06:38:50      0    x
2017-06-01 17:22:20      1    y
# group by the index column and resample; the NaNs are replaced by forward filling
df2 = (df1.groupby('index')
          .resample('10s')
          .ffill()
          .reset_index(0, drop=True)
          .drop('index', axis=1))
print(df2)
                    mode
current_time
2017-06-01 06:38:00    x
2017-06-01 06:38:10    x
2017-06-01 06:38:20    x
2017-06-01 06:38:30    x
2017-06-01 06:38:40    x
2017-06-01 06:38:50    x
2017-06-01 17:22:00    y
2017-06-01 17:22:10    y
2017-06-01 17:22:20    y