Select Pandas Dataframe Rows Between Two Dates
I am working on two tables as follows: A first table df1 giving a rate and a validity period: rates = {'rate': [ 0.974, 0.966, 0.996, 0.998, 0.994, 1.006, 1.042, 1.072, 0.9
Solution 1:
If your dataframes are not very big, you can simply join them on a dummy key (which produces every combination of rows, i.e. a cross join) and then filter the result down to the rows you need. See the example below (note that I had to adjust your example slightly so that the dates are correctly formatted).
import pandas as pd
rates = {'rate': [ 0.974, 0.966, 0.996, 0.998, 0.994, 1.006, 1.042, 1.072, 0.954],
'valid_from': ['31/12/2018','15/01/2019','01/02/2019','01/03/2019','01/04/2019','15/04/2019','01/05/2019','01/06/2019','30/06/2019'],
'valid_to': ['14/01/2019','31/01/2019','28/02/2019','31/03/2019','14/04/2019','30/04/2019','31/05/2019','29/06/2019','31/07/2019']}
df1 = pd.DataFrame(rates)
df1['valid_to'] = pd.to_datetime(df1['valid_to'],format ='%d/%m/%Y')
df1['valid_from'] = pd.to_datetime(df1['valid_from'],format='%d/%m/%Y')
Then your df1 would be:
    rate valid_from   valid_to
0  0.974 2018-12-31 2019-01-14
1  0.966 2019-01-15 2019-01-31
2  0.996 2019-02-01 2019-02-28
3  0.998 2019-03-01 2019-03-31
4  0.994 2019-04-01 2019-04-14
5  1.006 2019-04-15 2019-04-30
6  1.042 2019-05-01 2019-05-31
7  1.072 2019-06-01 2019-06-29
8  0.954 2019-06-30 2019-07-31
This is your second dataframe, df2:
data = {'date': ['03/01/2019','23/01/2019','27/02/2019','14/03/2019','05/04/2019','30/04/2019','14/06/2019'],
'amount': [200,305,155,67,95,174,236,]}
df2 = pd.DataFrame(data)
df2['date'] = pd.to_datetime(df2['date'],format ='%d/%m/%Y')
Then your df2 would look like the following:
        date  amount
0 2019-01-03     200
1 2019-01-23     305
2 2019-02-27     155
3 2019-03-14      67
4 2019-04-05      95
5 2019-04-30     174
6 2019-06-14     236
Your solution:
df1['key'] = 1
df2['key'] = 1
df_output = pd.merge(df1, df2, on='key').drop('key',axis=1)
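# Keep only rows whose date falls inside the validity period (both bounds inclusive,
# so a date equal to valid_from is matched as well)
df_output = df_output[(df_output['date'] >= df_output['valid_from']) & (df_output['date'] <= df_output['valid_to'])]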
This is what the resulting df_output looks like:
     rate valid_from   valid_to       date  amount
0   0.974 2018-12-31 2019-01-14 2019-01-03     200
8   0.966 2019-01-15 2019-01-31 2019-01-23     305
16  0.996 2019-02-01 2019-02-28 2019-02-27     155
24  0.998 2019-03-01 2019-03-31 2019-03-14      67
32  0.994 2019-04-01 2019-04-14 2019-04-05      95
40  1.006 2019-04-15 2019-04-30 2019-04-30     174
55  1.072 2019-06-01 2019-06-29 2019-06-14     236
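If the dataframes grow, the dummy-key join can become expensive, because it builds every pair of rows before filtering. One possible alternative, sketched here under the assumption that the validity periods do not overlap, is pd.merge_asof: it matches each date to the most recent valid_from, and a follow-up filter on valid_to drops dates that fall into a gap between periods. The helper name match_rates is hypothetical and not part of the original answer.

```python
import pandas as pd

# Same sample data as in the answer above.
df1 = pd.DataFrame({
    'rate': [0.974, 0.966, 0.996, 0.998, 0.994, 1.006, 1.042, 1.072, 0.954],
    'valid_from': ['31/12/2018', '15/01/2019', '01/02/2019', '01/03/2019', '01/04/2019',
                   '15/04/2019', '01/05/2019', '01/06/2019', '30/06/2019'],
    'valid_to': ['14/01/2019', '31/01/2019', '28/02/2019', '31/03/2019', '14/04/2019',
                 '30/04/2019', '31/05/2019', '29/06/2019', '31/07/2019'],
})
df1['valid_from'] = pd.to_datetime(df1['valid_from'], format='%d/%m/%Y')
df1['valid_to'] = pd.to_datetime(df1['valid_to'], format='%d/%m/%Y')

df2 = pd.DataFrame({
    'date': ['03/01/2019', '23/01/2019', '27/02/2019', '14/03/2019',
             '05/04/2019', '30/04/2019', '14/06/2019'],
    'amount': [200, 305, 155, 67, 95, 174, 236],
})
df2['date'] = pd.to_datetime(df2['date'], format='%d/%m/%Y')


def match_rates(amounts, rates):
    """Hypothetical helper: attach the rate whose validity period contains each date.

    Assumes the validity periods in `rates` do not overlap.
    """
    # merge_asof requires both frames to be sorted on their join keys.
    left = amounts.sort_values('date')
    right = rates.sort_values('valid_from')
    # direction='backward' picks the last period with valid_from <= date.
    matched = pd.merge_asof(left, right, left_on='date', right_on='valid_from',
                            direction='backward')
    # Drop dates that lie past the end of the matched period (or had no match at all).
    inside = matched['date'] <= matched['valid_to']
    return matched[inside].reset_index(drop=True)


df_asof = match_rates(df2, df1)
print(df_asof[['date', 'amount', 'rate']])
```

On the sample data this pairs each of the seven dates with the same rate as the dummy-key join, without materialising the 63 intermediate rows.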