Replacing Nan Value With A Word When Nan Is Not Repeated In Two Consecutive Rows

December 27, 2023

For the following data frame:

index Sent col_1 col_2 col_3
1     AB   NaN   DD    CC
1          0     1     0
2     SA   FA    FB    NaN

Solution 1:

There may be a better approach, but one way is to use shift to look at the row above and the row below each value. This causes a problem for the first and last rows: they have no neighbor on one side, so shift returns NaN there and the check fails. If adding extra rows and removing them afterwards is acceptable, you can try the following:

# Appending row to the top: https://stackoverflow.com/a/24284680/5916727
df.loc[-1] = [0 for n in range(len(df.columns))]
df.index = df.index + 1  # shifting index
df = df.sort_index()  # sorting by index
# Append row to below it
df.loc[df.shape[0]] = [0 for n in range(len(df.columns))]
print(df)

   index Sent col_1 col_2 col_3
0      0    0     0     0     0
1      1   AB   NaN    DD    CC
2      1          0     1     0
3      2   SA    FA    FB   NaN
4      2          1     1   NaN
5      3   FF   Sha   NaN    PA
6      3          1     0     1
7      0    0     0     0     0

Now build a mask for each column: a value is replaced only if it is NaN while both the next row, shift(-1), and the previous row, shift(1), are not NaN:

columns = ["col_1", "col_2", "col_3"]
for column in columns:
    df.loc[df[column].isnull() & df[column].shift(-1).notnull() & df[column].shift(1).notnull(), column] = "F"
df = df[1:-1]  # remove extra rows
print(df)

Output:

   index Sent col_1 col_2 col_3
1      1   AB     F    DD    CC
2      1          0     1     0
3      2   SA    FA    FB   NaN
4      2          1     1   NaN
5      3   FF   Sha     F    PA
6      3          1     0     1
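
As a variation on the approach above, the padding rows can probably be avoided by passing fill_value to shift, so that the missing neighbor of the first and last rows counts as a non-NaN value. This is only a sketch: the helper name fill_isolated_nan is hypothetical, the frame is rebuilt by hand from the test data, and the blank Sent cells are assumed to be empty strings.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the test data (assumption: blank Sent cells are empty strings).
df = pd.DataFrame({
    "index": [1, 1, 2, 2, 3, 3],
    "Sent": ["AB", "", "SA", "", "FF", ""],
    "col_1": [np.nan, 0, "FA", 1, "Sha", 1],
    "col_2": ["DD", 1, "FB", 1, np.nan, 0],
    "col_3": ["CC", 0, np.nan, np.nan, "PA", 1],
})


def fill_isolated_nan(frame, columns, word="F"):
    """Hypothetical helper: replace a NaN only when neither neighbor is NaN."""
    out = frame.copy()
    for column in columns:
        s = out[column]
        # fill_value=0 plays the role of the all-zero padding rows:
        # the slot shifted in at the edge counts as "not NaN".
        prev_ok = s.shift(1, fill_value=0).notna()
        next_ok = s.shift(-1, fill_value=0).notna()
        out.loc[s.isna() & prev_ok & next_ok, column] = word
    return out


result = fill_isolated_nan(df, ["col_1", "col_2", "col_3"])
print(result)
```

The two consecutive NaN values in col_3 are left alone because each one has a NaN neighbor, while the single NaN values in col_1 and col_2 are replaced.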

If you want, you can also remove the extra index column, which seems to contain duplicates.
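
A minimal sketch of dropping that column, assuming it is literally named "index" as in the test data (the small frame here is made up for illustration):

```python
import pandas as pd

# Tiny illustrative frame with a duplicated "index" column.
df = pd.DataFrame({
    "index": [1, 1, 2],
    "Sent": ["AB", "", "SA"],
    "col_1": ["F", 0, "FA"],
})

# Drop the column and renumber the rows from 0.
cleaned = df.drop(columns=["index"]).reset_index(drop=True)
print(cleaned)
```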

Update (adding .csv data tested with)

I had the following in the test CSV file:

index,Sent,col_1,col_2,col_3
1,AB,,DD,CC
1,,0,1,0
2,SA,FA,FB,NA
2,,1,1,NA
3,FF,Sha,,PA
3,,1,0,1

Then you can use the following to create the input dataframe:

import pandas as pd
df = pd.read_csv("FILENAME.csv")
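
To try this without creating a file, the same CSV text can be read from an in-memory string. This is a sketch using io.StringIO with the test data above. Note that with read_csv defaults, both empty fields and the literal NA are parsed as NaN.

```python
import io

import pandas as pd

# Same content as the test CSV file, held in a string instead of on disk.
csv_text = """index,Sent,col_1,col_2,col_3
1,AB,,DD,CC
1,,0,1,0
2,SA,FA,FB,NA
2,,1,1,NA
3,FF,Sha,,PA
3,,1,0,1
"""

df = pd.read_csv(io.StringIO(csv_text))

# Count the missing values per column to confirm how the file was parsed.
nan_counts = df[["col_1", "col_2", "col_3"]].isna().sum()
print(df)
print(nan_counts)
```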
