Conditionally Fill Column With Value From Another Dataframe Based On Row Match In Pandas
I find myself lost trying to solve this problem (automating tax paperwork). I have two dataframes: one with the quarterly historical records of EUR/USD exchange rates, and another
Solution 1:
You can expand your rates dataframe to include every date and forward-fill the gaps, add a "Currency" column to it, and then join the two dataframes on both the date and currency columns.
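import numpy as np
import pandas as pd

# one row per calendar day, so weekends and other gaps get a row too
idx = pd.DataFrame(pd.date_range('2017-07-05', '2017-07-12'), columns=['Date'])
rates = pd.merge(idx, rates, how="left", on="Date")
rates['Currency'] = 'USD'
rates['Rate'] = rates['Rate'].ffill()  # carry the last known rate forward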
        Date    Rate Currency
0 2017-07-05  1.1329      USD
1 2017-07-06  1.1385      USD
2 2017-07-07  1.1412      USD
3 2017-07-08  1.1412      USD
4 2017-07-09  1.1412      USD
5 2017-07-10  1.1387      USD
6 2017-07-11  1.1405      USD
7 2017-07-12  1.1449      USD
A left join on both columns then attaches the rates:
result = pd.merge(sales, rates, how="left", on=["Currency", "Date"])
# EUR rows need no conversion; every other row takes the joined rate
result['Rate'] = np.where(result['Currency'] == 'EUR', 1, result['Rate_y'])
result = result.drop(['Rate_x', 'Rate_y'], axis=1)
which gives:
        Date        From Currency  Amount    Rate
0 2017-07-06      PayPal      USD     100  1.1385
1 2017-07-06  Fastspring      USD     200  1.1385
2 2017-07-09  Fastspring      USD     100  1.1412
3 2017-07-10          EU      EUR     100  1.0000
4 2017-07-10      PayPal      USD     200  1.1387
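The steps of this answer can be sketched end to end as follows. The sample frames are an assumption: they are reconstructed from the tables shown in the answers, and here sales is assumed to have no pre-existing Rate column, so the merge produces no Rate_x/Rate_y suffixes.

```python
import pandas as pd

# Sample data reconstructed from the tables in the answers (an assumption).
rates = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-05", "2017-07-06", "2017-07-07",
                            "2017-07-10", "2017-07-11", "2017-07-12"]),
    "Rate": [1.1329, 1.1385, 1.1412, 1.1387, 1.1405, 1.1449],
})
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-06", "2017-07-06", "2017-07-09",
                            "2017-07-10", "2017-07-10"]),
    "From": ["PayPal", "Fastspring", "Fastspring", "EU", "PayPal"],
    "Currency": ["USD", "USD", "USD", "EUR", "USD"],
    "Amount": [100, 200, 100, 100, 200],
})

# One row per calendar day, carrying the last known rate over the gaps.
idx = pd.DataFrame({"Date": pd.date_range(rates["Date"].min(), rates["Date"].max())})
daily = idx.merge(rates, how="left", on="Date")
daily["Rate"] = daily["Rate"].ffill()
daily["Currency"] = "USD"

# The left join keeps every sale; EUR rows find no match and are set to 1.0.
result = sales.merge(daily, how="left", on=["Currency", "Date"])
result.loc[result["Currency"] == "EUR", "Rate"] = 1.0
print(result)
```

Setting only the EUR rows to 1.0, rather than filling every missing value, means a USD sale dated outside the rate range would still show up as NaN instead of being silently treated as EUR.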
Solution 2:
I break the steps down using pd.merge_asof:
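# both frames must be sorted by Date; each sale gets the latest rate on or before its date
sales = pd.merge_asof(sales, rates, on='Date', direction='backward', allow_exact_matches=True)
# EU sales are already in EUR, so they keep their original rate (Rate_x)
sales.loc[sales.From == 'EU', 'Rate_y'] = sales.Rate_x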
sales
Out[748]:
        Date        From Currency  Amount  Rate_x  Rate_y
0 2017-07-06      PayPal      USD     100       1  1.1385
1 2017-07-06  Fastspring      USD     200       1  1.1385
2 2017-07-09  Fastspring      USD     100       1  1.1412
3 2017-07-10          EU      EUR     100       1  1.0000
4 2017-07-10      PayPal      USD     200       1  1.1387
Then
sales.drop(columns='Rate_x').rename(columns={'Rate_y':'Rate'})
Out[749]:
        Date        From Currency  Amount    Rate
0 2017-07-06      PayPal      USD     100  1.1385
1 2017-07-06  Fastspring      USD     200  1.1385
2 2017-07-09  Fastspring      USD     100  1.1412
3 2017-07-10          EU      EUR     100  1.0000
4 2017-07-10      PayPal      USD     200  1.1387
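For readers who want to try the as-of join in isolation, here is a minimal sketch. The sample frames are an assumption (reconstructed from the tables in the answers), and sales is assumed to carry no Rate column of its own, so the result has a single Rate column and no suffixes to clean up.

```python
import pandas as pd

# Sample data reconstructed from the tables in the answers (an assumption).
rates = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-05", "2017-07-06", "2017-07-07",
                            "2017-07-10", "2017-07-11", "2017-07-12"]),
    "Rate": [1.1329, 1.1385, 1.1412, 1.1387, 1.1405, 1.1449],
})
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-06", "2017-07-06", "2017-07-09",
                            "2017-07-10", "2017-07-10"]),
    "From": ["PayPal", "Fastspring", "Fastspring", "EU", "PayPal"],
    "Currency": ["USD", "USD", "USD", "EUR", "USD"],
    "Amount": [100, 200, 100, 100, 200],
})

# merge_asof requires both frames to be sorted by the key column.
sales = sales.sort_values("Date", kind="stable")
rates = rates.sort_values("Date")

# For each sale, pick the most recent rate dated on or before the sale date.
out = pd.merge_asof(sales, rates, on="Date", direction="backward")

# EUR sales need no conversion.
out.loc[out["Currency"] == "EUR", "Rate"] = 1.0
print(out)
```

Because the lookup is "latest rate on or before the date", the rates frame does not have to be expanded to one row per calendar day first.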
Solution 3:
Here is how I would do it without merge:
1. Fill rates with the missing dates and ffill, as in the other answers, but keep Date as the index.
2. Map this dataframe onto sales, using loc to skip the rows with EUR.
idx = pd.date_range(rates['Date'].min(), rates['Date'].max())
rates = rates.set_index('Date').reindex(idx).ffill()
sales.loc[sales['Currency'] !='EUR','Rate'] = sales.loc[sales['Currency'] !='EUR','Date'].map(rates['Rate'])
        Date        From Currency  Amount    Rate
0 2017-07-06      PayPal      USD     100  1.1385
1 2017-07-06  Fastspring      USD     200  1.1385
2 2017-07-09  Fastspring      USD     100  1.1412
3 2017-07-10          EU      EUR     100  1.0000
4 2017-07-10      PayPal      USD     200  1.1387
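Or you can even do it without changing the rates dataframe:
# method='ffill' fills each sales date from the last rate on or before it;
# a chained .ffill() after reindex would only see the dates kept by the reindex
mapper = rates.set_index('Date')['Rate'].reindex(sales['Date'].unique(), method='ffill')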
sales.loc[sales['Currency'] != 'EUR','Rate'] = sales.loc[sales['Currency'] != 'EUR','Date'].map(mapper)
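A self-contained sketch of the lookup-without-merge idea, using sample frames reconstructed from the tables in the answers (an assumption), with sales starting out with Rate set to 1.0. It also compares two ways of building the mapper from the sales dates: filling during the reindex (method="ffill") versus chaining .ffill() afterwards, which can only fill from the dates that survived the reindex.

```python
import pandas as pd

# Sample data reconstructed from the tables in the answers (an assumption).
rates = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-05", "2017-07-06", "2017-07-07",
                            "2017-07-10", "2017-07-11", "2017-07-12"]),
    "Rate": [1.1329, 1.1385, 1.1412, 1.1387, 1.1405, 1.1449],
})
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2017-07-06", "2017-07-06", "2017-07-09",
                            "2017-07-10", "2017-07-10"]),
    "From": ["PayPal", "Fastspring", "Fastspring", "EU", "PayPal"],
    "Currency": ["USD", "USD", "USD", "EUR", "USD"],
    "Amount": [100, 200, 100, 100, 200],
    "Rate": 1.0,  # default rate; stays in place for EUR rows
})

rate_by_date = rates.set_index("Date")["Rate"]  # index must be sorted for method="ffill"
dates = sales["Date"].unique()

# Fill while reindexing: each date takes the last published rate on or before it.
mapper = rate_by_date.reindex(dates, method="ffill")
# Fill after reindexing: a gap is filled from the previous *sales* date instead.
chained = rate_by_date.reindex(dates).ffill()

is_usd = sales["Currency"] != "EUR"
sales.loc[is_usd, "Rate"] = sales.loc[is_usd, "Date"].map(mapper)
print(sales)
```

On this data the two mappers disagree for 2017-07-09, a date with no published rate: the first carries forward the rate from 2017-07-07, while the chained version reuses the rate of the previous sales date, 2017-07-06.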
Time testing:
wen: 0.011892538983374834
gayatri: 0.13312408898491412
vaishali: 0.009498710976913571