How To Pad On Extra Rows In Dataframe For Neural Network
I have a dataframe that looks like this: For each unique name, I want to ensure there are exactly 2 entries. If a name has more than 2 entries, I want the two entries with the lar
Solution 1:
A little harder than I thought
import pandas as pd
# sort first, so the highest scores come first within each name
df = df.sort_values(['Name','Score'],ascending=[True,False])
# keep the top two rows per name by Score
df = df.groupby('Name').head(2).copy()
# number the rows within each name (0, 1); the missing slots are filled in below
df['id'] = df.groupby('Name').cumcount()
# reindex to every (name, id) pair, fill the gaps with 0, then drop the helper column
out= df.set_index(['Name','id'])\
.reindex(pd.MultiIndex.from_product([df.Name.unique(),df.id.unique()],names=['name','id']))\
.fillna(0).reset_index().drop(columns='id')
out
Out[273]:
  name  Score  Ind1
0    A   34.0   1.0
1    A   31.0   3.0
2    B   40.0   2.0
3    B   33.0   3.0
4    C   21.0   5.0
5    C    0.0   0.0
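For anyone who wants to run this end to end, here is a minimal sketch of the same idea wrapped in a function. The input frame is hypothetical (the original data is not shown in the question) and was chosen only to be consistent with the output above. The helper name `pad_top_n` is also made up for illustration. One small change from the answer: the slot ids come from `range(n)` rather than `df.id.unique()`, so padding still happens when no name has `n` rows.

```python
import pandas as pd

# Hypothetical input, chosen to be consistent with the output shown above.
df = pd.DataFrame({
    "Name":  ["A", "A", "A", "B", "B", "C"],
    "Score": [31.0, 34.0, 20.0, 40.0, 33.0, 21.0],
    "Ind1":  [3.0, 1.0, 2.0, 2.0, 3.0, 5.0],
})

def pad_top_n(frame, n=2):
    """Keep the n highest-scoring rows per Name and zero-pad short groups."""
    # Sort so the highest scores come first within each name, then keep n rows.
    top = (frame.sort_values(["Name", "Score"], ascending=[True, False])
                .groupby("Name").head(n).copy())
    # Number the rows within each name: 0, 1, ..., n-1.
    top["id"] = top.groupby("Name").cumcount()
    # Every (Name, id) pair; pairs missing from `top` become NaN rows on reindex.
    full = pd.MultiIndex.from_product(
        [top["Name"].unique(), range(n)], names=["Name", "id"])
    return (top.set_index(["Name", "id"])
               .reindex(full)
               .fillna(0)
               .reset_index()
               .drop(columns="id"))

out = pad_top_n(df)
print(out)
```

With this input, A is cut down to its two best rows and C gets one all-zero row.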
Solution 2:
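Keep the two highest-scoring rows per name from the initial df (sort first, since head(2) just takes rows in their current order)
import numpy as np
import pandas as pd
g=df.sort_values(['Name','Score'],ascending=[True,False]).groupby('Name').head(2).reset_index(drop=True)
Extract unique values in Name into a list (taken from g so the order matches; a set has no guaranteed order)
l=g['Name'].unique().tolist()
Create new Series with each name repeated twice
s=pd.Series(list(np.repeat(l, 2)), name='Name')
Merge Series to df. The merge aligns by row position, so it only pads correctly when the name with fewer than 2 rows is the last one, as it is here
g.merge(s.rename('new_Name'), left_index=True, right_index=True, how='right').fillna(0).drop(columns='Name')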
   Score  Ind1 new_Name
0   34.0   1.0        A
1   31.0   3.0        A
2   40.0   2.0        B
3   33.0   3.0        B
4   21.0   5.0        C
5    0.0   0.0        C
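Because the merge above lines rows up by position, it can put the padding in the wrong place when the short group is not the last one. Below is a rough sketch of a variant that merges on a (Name, id) key instead. The input frame is hypothetical, and the names `n`, `names`, and `s` are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical input; here the short group ("A") is not the last one.
df = pd.DataFrame({
    "Name":  ["A", "B", "B", "B", "C", "C"],
    "Score": [34.0, 40.0, 33.0, 10.0, 21.0, 18.0],
    "Ind1":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

n = 2  # number of rows wanted per name

# Keep the n highest-scoring rows per name and number them within each name.
g = (df.sort_values("Score", ascending=False)
       .groupby("Name").head(n).copy())
g["id"] = g.groupby("Name").cumcount()

# Scaffold with exactly n slots per name, in order of first appearance.
names = df["Name"].unique()
s = pd.DataFrame({
    "Name": np.repeat(names, n),
    "id": np.tile(np.arange(n), len(names)),
})

# Left-merge on the key so each slot finds its own row; empty slots become 0.
out = (s.merge(g, on=["Name", "id"], how="left")
        .fillna(0)
        .drop(columns="id"))
print(out)
```

Here the zero row lands under A, and B loses its lowest-scoring row, whichever order the names appear in.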