
Comparing And Replacing Values Inside Dataframes

I have two lists: the main list, which serves as the 'key', and a second list that needs updating because it is missing information. main_df: +---------+--------+--------+--------+------

Solution 1:

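Try the following code, which uses the update() function. Both DataFrames are indexed by (CITY, STATE), and update() then copies the non-NaN LAT and LONG values from the main table into data_df wherever the index keys match.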

import numpy as np
import pandas as pd

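# main.txt is the pipe-delimited reference ('key') table; data.csv is the table with missing coordinates
main_df = pd.read_csv('/home/Jian/Downloads/main.txt', sep='|')
data_df = pd.read_csv('/home/Jian/Downloads/data.csv')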

Out[229]: 
      ID      LAT     LONG                        CITY STATE                 TIME
0  12345      NaN      NaN           Cape Hinchinbrook    AK  2015-06-27 21:03:19
1  12346      NaN      NaN              Delenia Island    AK  2015-06-27 21:03:19
2  12347  29.7401 -95.4636                     Houston    TX  2015-06-27 21:03:19
3  12348  41.7132 -83.7032                    Sylvania    OH  2015-06-27 21:03:19
4  12349      NaN      NaN                  Alaskaland    AK  2015-06-27 21:03:19
5  12350      NaN      NaN  Badger Road Baptist Church    AK  2015-06-27 21:03:19

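# Keep only the coordinate and key columns, rename them to match data_df, then index by (CITY, STATE)
main_df_part = main_df[['PRIM_LAT_DEC', 'PRIM_LONG_DEC', 'FEATURE_NAME', 'STATE_ALPHA']]
main_df_part.columns = ['LAT', 'LONG', 'CITY', 'STATE']
main_df_part = main_df_part.set_index(['CITY', 'STATE'])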

Out[230]: 
                                      LAT      LONG
CITY                       STATE                   
Pacific Ocean              CA     39.3103 -123.8447
Cape Hinchinbrook          AK     60.2347 -146.6417
Delenia Island             AK     60.3394 -148.1383
Alaskaland                 AK     64.8394 -147.7700
Badger Road Baptist Church AK     64.8167 -147.5661
Barnes Creek               AK     65.0014 -147.2939
Barnette Magnet School     AK     64.8383 -147.7300
Bentley Park               AK     64.8364 -147.6942

# Index data_df by the same keys so that update() can align the rows
data_df = data_df.set_index(['CITY', 'STATE'])

Out[233]: 
                                     ID      LAT     LONG                 TIME
CITY                       STATE                                              
Cape Hinchinbrook          AK     12345      NaN      NaN  2015-06-27 21:03:19
Delenia Island             AK     12346      NaN      NaN  2015-06-27 21:03:19
Houston                    TX     12347  29.7401 -95.4636  2015-06-27 21:03:19
Sylvania                   OH     12348  41.7132 -83.7032  2015-06-27 21:03:19
Alaskaland                 AK     12349      NaN      NaN  2015-06-27 21:03:19
Badger Road Baptist Church AK     12350      NaN      NaN  2015-06-27 21:03:19


# update() modifies data_df in place and returns None; the output below is data_df afterwards
data_df.update(main_df_part)

Out[235]: 
                                     ID      LAT      LONG                 TIME
CITY                       STATE                                               
Cape Hinchinbrook          AK     12345  60.2347 -146.6417  2015-06-27 21:03:19
Delenia Island             AK     12346  60.3394 -148.1383  2015-06-27 21:03:19
Houston                    TX     12347  29.7401  -95.4636  2015-06-27 21:03:19
Sylvania                   OH     12348  41.7132  -83.7032  2015-06-27 21:03:19
Alaskaland                 AK     12349  64.8394 -147.7700  2015-06-27 21:03:19
Badger Road Baptist Church AK     12350  64.8167 -147.5661  2015-06-27 21:03:19
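The steps above can be condensed into a self-contained sketch that runs without the two input files. The small in-memory frames are stand-ins built from a few rows of the outputs shown above, and the names lookup and filled are illustrative only. Two choices here are assumptions rather than part of the original answer: overwrite=False, so that only NaN cells are filled and existing coordinates are left untouched, and drop_duplicates(), which keeps the first row per (CITY, STATE) key.

```python
import numpy as np
import pandas as pd

# Hypothetical in-memory stand-ins for main.txt and data.csv,
# using a few rows copied from the outputs shown above.
main_df = pd.DataFrame({
    'FEATURE_NAME': ['Pacific Ocean', 'Cape Hinchinbrook', 'Delenia Island', 'Alaskaland'],
    'STATE_ALPHA': ['CA', 'AK', 'AK', 'AK'],
    'PRIM_LAT_DEC': [39.3103, 60.2347, 60.3394, 64.8394],
    'PRIM_LONG_DEC': [-123.8447, -146.6417, -148.1383, -147.7700],
})
data_df = pd.DataFrame({
    'ID': [12345, 12346, 12347, 12349],
    'LAT': [np.nan, np.nan, 29.7401, np.nan],
    'LONG': [np.nan, np.nan, -95.4636, np.nan],
    'CITY': ['Cape Hinchinbrook', 'Delenia Island', 'Houston', 'Alaskaland'],
    'STATE': ['AK', 'AK', 'TX', 'AK'],
})

# Build the lookup table: rename columns to match data_df and index by the keys.
lookup = (
    main_df.rename(columns={
        'PRIM_LAT_DEC': 'LAT',
        'PRIM_LONG_DEC': 'LONG',
        'FEATURE_NAME': 'CITY',
        'STATE_ALPHA': 'STATE',
    })
    # Assumption: if a (CITY, STATE) key appears more than once, keep the first row.
    .drop_duplicates(subset=['CITY', 'STATE'])
    .set_index(['CITY', 'STATE'])
)

filled = data_df.set_index(['CITY', 'STATE'])
# overwrite=False fills only the NaN cells; Houston's existing values are kept.
filled.update(lookup, overwrite=False)
filled = filled.reset_index()

print(filled)
```

With the default overwrite=True, update() would also replace coordinates that data_df already has whenever the lookup table holds a non-NaN value for the same key. Deduplicating first is a precaution, since update() aligns the two frames by reindexing and that generally fails when the lookup index contains repeated keys.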
