
Pandas Groupby Apply On Multiple Columns To Generate A New Column

I would like to generate a new column in a pandas DataFrame using groupby-apply. For example, I have a DataFrame: df = pd.DataFrame({'A':[1,2,3,4],'B':['A','B','A','B'],'C':[0,0,1,1]}) an

Solution 1:

For this case, I do not think it is necessary to include column A in the apply. We can use transform to broadcast the per-group mean of C back to every row, then subtract it from A:

df.A-df.groupby('B').C.transform('mean')
Out[272]: 
0    0.5
1    1.5
2    2.5
3    3.5
dtype: float64

And you can assign it back to the DataFrame as a new column:

df['diff']= df.A-df.groupby('B').C.transform('mean')
df
Out[274]: 
   A  B  C  diff
0  1  A  0   0.5
1  2  B  0   1.5
2  3  A  1   2.5
3  4  B  1   3.5
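
As a self-contained sketch of the transform approach above (the intermediate name group_mean is only illustrative and not part of the original answer), the steps can be run end to end like this:

```python
import pandas as pd

# Same example frame as in the question
df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['A', 'B', 'A', 'B'],
                   'C': [0, 0, 1, 1]})

# transform('mean') returns a Series aligned with df's index,
# holding each row's group mean of C (0.5 for both groups here)
group_mean = df.groupby('B')['C'].transform('mean')

# Row-wise subtraction works directly because the indexes match
df['diff'] = df['A'] - group_mean
print(df)
```

Because transform keeps the original index, no reindexing or merging is needed before the subtraction.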

Solution 2:

Let's use group_keys=False in the groupby, so that the result of apply keeps the original row index and can be assigned directly:

df.assign(D=df.groupby('B', group_keys=False).apply(lambda x: x.A - x.C.mean()))

Output:

   A  B  C    D
0  1  A  0  0.5
1  2  B  0  1.5
2  3  A  1  2.5
3  4  B  1  3.5
