
How To Pad On Extra Rows In Dataframe For Neural Network

I have a dataframe that looks like this: For each unique name, I want to ensure there are exactly 2 entries. If a name has more than 2 entries, I want the two entries with the lar

Solution 1:

A little harder than I thought:

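import pandas as pd

# Sort by name, then by score from highest to lowest
df = df.sort_values(['Name', 'Score'], ascending=[True, False])
# Keep the top two rows per name by Score
df = df.groupby('Name').head(2).copy()
# Number the rows within each name (0, 1) so the missing slots can be filled in
df['id'] = df.groupby('Name').cumcount()
# Reindex on every (name, id) pair; pairs that did not exist become all-zero rows.
# range(2) is used instead of df.id.unique() so padding still works when no name has two rows.
out = df.set_index(['Name', 'id'])\
        .reindex(pd.MultiIndex.from_product([df.Name.unique(), range(2)], names=['name', 'id']))\
        .fillna(0).reset_index().drop(columns='id')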
out
Out[273]: 
  name  Score  Ind1
0    A   34.0   1.0
1    A   31.0   3.0
2    B   40.0   2.0
3    B   33.0   3.0
4    C   21.0   5.0
5    C    0.0   0.0
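
The steps in Solution 1 can be wrapped into one reusable function. This is only a sketch: the function name `pad_groups`, its parameters, and the `sample` frame are hypothetical (the original input data is not shown in the post), and the final reshape assumes the network expects one fixed-size block of rows per name.

```python
import pandas as pd

def pad_groups(df, key="Name", sort_by="Score", n=2, fill=0):
    """Keep the top-n rows per key (by sort_by) and pad short groups with fill."""
    top = (df.sort_values([key, sort_by], ascending=[True, False])
             .groupby(key).head(n).copy())
    # Position of each row inside its group: 0, 1, ..., n-1
    top["slot"] = top.groupby(key).cumcount()
    # Every (key, slot) pair that must exist in the result
    full = pd.MultiIndex.from_product([top[key].unique(), range(n)],
                                      names=[key, "slot"])
    return (top.set_index([key, "slot"])
               .reindex(full)          # missing pairs become NaN rows
               .fillna(fill)
               .reset_index()
               .drop(columns="slot"))

# Hypothetical sample: A has three rows, B has two, C has one
sample = pd.DataFrame({
    "Name":  ["A", "A", "A", "B", "B", "C"],
    "Score": [31.0, 34.0, 12.0, 33.0, 40.0, 21.0],
    "Ind1":  [3.0, 1.0, 4.0, 3.0, 2.0, 5.0],
})
padded = pad_groups(sample)

# Assumed downstream use: one (n, n_features) block per name
batch = padded[["Score", "Ind1"]].to_numpy().reshape(-1, 2, 2)
```

Because every name ends up with exactly `n` rows, the reshape is safe; `batch.shape` is `(number_of_names, 2, 2)` here.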

Solution 2:

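Keep at most two rows per name from the initial df (this takes the first two rows of each group, so sort by Score in descending order beforehand if you need the two largest)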

g=df.groupby('Name').head(2).reset_index(drop=True)

Extract the unique values in Name into a list (unique() keeps the order of appearance, whereas set() does not guarantee any order)

l=df['Name'].unique().tolist()

Create new Series

s=pd.Series(list(np.repeat(l, 2)), name='Name')

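Merge the Series into the df on the index. Note that this matches rows by position, so it only gives the right result when the rows of g already line up with s, for example when only the last name has fewer than two rows

g.merge(s.rename('new_Name'), left_index=True, right_index=True, how='right').fillna(0).drop(columns='Name')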



   Score  Ind1 new_Name
0   34.0   1.0        A
1   31.0   3.0        A
2   40.0   2.0        B
3   33.0   3.0        B
4   21.0   5.0        C
5    0.0   0.0        C
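
If a short group can appear anywhere in the data, one possible variant of the merge idea is to join on the name plus a per-name slot number rather than on row position. The sketch below is not the answerer's code: the `slot` column, the `scaffold` frame, and the `sample` data are hypothetical names and values chosen for illustration.

```python
import pandas as pd

# Hypothetical sample where the short group (B) sits in the middle
sample = pd.DataFrame({
    "Name":  ["A", "B", "A", "C", "C", "C"],
    "Score": [34.0, 40.0, 31.0, 21.0, 25.0, 9.0],
    "Ind1":  [1.0, 2.0, 3.0, 5.0, 4.0, 6.0],
})
n = 2

# Top-n rows per name by Score, each tagged with its position in the group
top = (sample.sort_values("Score", ascending=False)
             .groupby("Name").head(n).copy())
top["slot"] = top.groupby("Name").cumcount()

# Scaffold holding every (Name, slot) pair the result must contain
names = sorted(top["Name"].unique())
scaffold = pd.DataFrame(
    [(name, slot) for name in names for slot in range(n)],
    columns=["Name", "slot"],
)

# Left merge keeps the scaffold order; pairs with no match are filled with 0
result = (scaffold.merge(top, on=["Name", "slot"], how="left")
                  .fillna(0)
                  .drop(columns="slot"))
```

Merging on explicit keys means the row order of the input no longer matters, at the cost of building the small scaffold frame first.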
