
Merging Data On Date Time Column (POSIXct Format)

I want to merge two data frames on their Date_Time column. The date-time columns contain both matching and differing values, but I am unable to merge them such that all unique date-time ...

Solution 1:

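In R, merge() with all = TRUE performs a full outer join, keeping every Date_Time from both data frames:

merge(df_so2, df_met, by = "Date_Time", all = TRUE)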

        Date_Time X.x POC Datum         Date_GMT Sample.Measurement MDL X.y air_temp_set_1 dew_point_temperature_set_1
1 2015-01-01 1:00  NA  NA  <NA>             <NA>                 NA  NA   1           35.6                        35.6
2 2015-01-01 2:00  NA  NA  <NA>             <NA>                 NA  NA   2           35.6                        35.6
3 2015-01-01 3:00   1   2 WGS84 01/01/2015 09:00                2.3 0.2   3           35.6                        35.6
4 2015-01-01 4:00   2   2 WGS84 01/01/2015 10:00                2.5 0.2   4           33.8                        33.8
5 2015-01-01 5:00   3   2 WGS84 01/01/2015 11:00                2.1 0.2   5           33.2                        33.2
6 2015-01-01 6:00   4   2 WGS84 01/01/2015 12:00                2.3 0.2   6           33.8                        33.8
7 2015-01-01 7:00   5   2 WGS84 01/01/2015 13:00                1.1 0.2   7           33.8                        33.8

Solution 2:

An outer merge should keep them all:

  • See pandas.DataFrame.merge.
  • outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
  • Based on your comment, you want all the dates, not just those shown in Expected Output.
  • Add the parameter sort=True if you want them sorted by date.
df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')

 X_x  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL  X_y  air_temp_set_1  dew_point_temperature_set_1
 1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2    3            35.6                         35.6
 2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2    4            33.8                         33.8
 3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2    5            33.2                         33.2
 4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2    6            33.8                         33.8
 5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2    7            33.8                         33.8
 NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN    1            35.6                         35.6
 NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN    2            35.6                         35.6
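
For readers who want to try the outer merge without the original data, here is a minimal sketch. The two small frames are hypothetical stand-ins for df_so2 and df_met, and indicator=True is an optional extra (not used in the answer above) that labels where each row came from.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the two frames; only Date_Time is shared.
df_so2 = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 03:00", "2015-01-01 04:00"]),
    "Sample.Measurement": [2.3, 2.5],
})
df_met = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 01:00", "2015-01-01 03:00"]),
    "air_temp_set_1": [35.6, 35.6],
})

# how="outer" keeps the union of keys from both frames,
# sort=True orders the result by the key column,
# indicator=True adds a _merge column: left_only, right_only or both.
merged = pd.merge(df_so2, df_met, on="Date_Time", how="outer",
                  sort=True, indicator=True)
print(merged)
```

Rows marked left_only or right_only are the timestamps that exist in just one frame, which can help when checking that no date-time was lost.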

Without the columns from df_met:

df_exp.drop(columns=['X_y', 'air_temp_set_1', 'dew_point_temperature_set_1'], inplace=True)
df_exp.rename(columns={'X_x': 'X'}, inplace=True)

   X  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL
 1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2
 2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2
 3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2
 4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2
 5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2
 NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN
 NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN
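
An alternative to dropping and renaming afterwards is to pass only the key column of df_met into the merge. This is a sketch under assumptions: the frames below are hypothetical, and drop_duplicates() is added so that repeated timestamps in df_met would not multiply rows.

```python
import pandas as pd

# Hypothetical stand-ins; both frames carry an X column, as in the question.
df_so2 = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 03:00", "2015-01-01 04:00"]),
    "X": [1, 2],
    "MDL": [0.2, 0.2],
})
df_met = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 01:00", "2015-01-01 03:00"]),
    "X": [1, 3],
    "air_temp_set_1": [35.6, 35.6],
})

# Keep only the unique timestamps from df_met, so the merge adds rows
# for missing dates but no extra columns (and no X_x / X_y suffixes).
keys_only = df_met[["Date_Time"]].drop_duplicates()
df_exp = pd.merge(df_so2, keys_only, on="Date_Time", how="outer", sort=True)
print(df_exp)
```

Because X exists on only one side of this merge, it keeps its original name and no rename step is needed.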

Solution 3:

df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')

I got:

POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL  air_temp_set_1  dew_point_temperature_set_1  relative_humidity_set_1  wind_speed_set_1  cloud_layer_1_code_set_1  wind_direction_set_1  pressure_set_1d  weather_cond_code_set_1  visibility_set_1  wind_cardinal_direction_set_1d  weather_condition_set_1d
  2  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2            35.6                         35.6                    100.0               0.0                      14.0                   0.0        29.943333                      9.0              0.25                               N                       Fog
  1  WGS84  2015-01-01 3:00  01/01/2015 09:00                 0.6  2.0            35.6                         35.6                    100.0               0.0                      14.0                   0.0        29.943333                      9.0              0.25                               N                       Fog
  1  WGS84  2015-01-01 3:00  01/01/2015 12:00                 7.4  0.2            35.6                         35.6                    100.0               0.0                      14.0                   0.0        29.943333                      9.0              0.25                               N                       Fog
  1  WGS84  2015-01-01 3:00  01/01/2015 10:00                 1.0  0.2            35.6                          NaN                      NaN               NaN                       NaN                   NaN              NaN                      NaN               NaN                             NaN                       NaN

Notes:

  • Check df_met.info() and df_so2.info() and verify that Date_Time is non-null datetime64[ns] in both frames.
  • If it is not, convert both columns before merging:
  • df_so2.Date_Time = pd.to_datetime(df_so2.Date_Time)
  • df_met.Date_Time = pd.to_datetime(df_met.Date_Time)
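
To illustrate why the dtype check in the notes matters, here is a small sketch. The sample strings are hypothetical: they describe the same instant in two different text formats, so a merge on the raw strings treats them as different keys, while a merge after pd.to_datetime treats them as one.

```python
import pandas as pd

# Hypothetical example: the same instant written in two string formats.
df_so2 = pd.DataFrame({"Date_Time": ["2015-01-01 03:00"], "MDL": [0.2]})
df_met = pd.DataFrame({"Date_Time": ["2015-01-01 03:00:00"],
                       "air_temp_set_1": [35.6]})

# Merging on strings compares text, so the two keys do not match.
as_strings = pd.merge(df_so2, df_met, on="Date_Time", how="outer")

# Convert both key columns to datetime, then merge again.
df_so2["Date_Time"] = pd.to_datetime(df_so2["Date_Time"])
df_met["Date_Time"] = pd.to_datetime(df_met["Date_Time"])
as_datetimes = pd.merge(df_so2, df_met, on="Date_Time", how="outer")

print(len(as_strings), len(as_datetimes))
print(as_datetimes.dtypes)
```

If the first count is larger than the second on real data, mismatched key formats are a likely cause of rows failing to line up.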
