How To Calculate Sum From One Column With Some Conditions In Python Pandas?
I have a pandas DataFrame that looks like this: df = pd.DataFrame({'sport_name': ['football','football','football','football','football','cricket','cricket','cricket','cricket'
Solution 1:
Use:
# convert the column to int
df['person_count'] = df['person_count'].astype(int)
# reshape the city and person_symbol columns into one column
df1 = df.set_index(['sport_name','person_name','person_count']).stack().reset_index(name='val')
print(df1)
    sport_name  person_name  person_count        level_3     val
 0    football       ramesh            10           city  mumbai
 1    football       ramesh            10  person_symbol     ram
 2    football       ramesh            14           city  mumbai
 3    football       ramesh            14  person_symbol     mum
 4    football       ramesh            25           city   delhi
 5    football       ramesh            25  person_symbol     mum
 6    football       ramesh            20           city   delhi
 7    football       ramesh            20  person_symbol     ram
 8    football        mohit            11           city    pune
 9    football        mohit            11  person_symbol     moh
10     cricket       mahesh            34           city   surat
11     cricket       mahesh            34  person_symbol     mah
12     cricket       mahesh            23           city   surat
13     cricket       mahesh            23  person_symbol     sur
14     cricket       mahesh            43           city   panji
15     cricket       mahesh            43  person_symbol     sur
16     cricket       mahesh            34           city   panji
17     cricket       mahesh            34  person_symbol     mah
# concatenate columns
a = df1['sport_name'] + '.' + df1['person_name'] + '.TOTAL.' + df1['val'] + '_count'
# group by Series a and aggregate with sum
df2 = (df1['person_count'].groupby(a.rename('derived_symbol'), sort=False)
                          .sum()
                          .reset_index(name='person_count'))
print(df2)
                       derived_symbol  person_count
0  football.ramesh.TOTAL.mumbai_count            24
1     football.ramesh.TOTAL.ram_count            30
2     football.ramesh.TOTAL.mum_count            39
3   football.ramesh.TOTAL.delhi_count            45
4     football.mohit.TOTAL.pune_count            11
5      football.mohit.TOTAL.moh_count            11
6    cricket.mahesh.TOTAL.surat_count            57
7      cricket.mahesh.TOTAL.mah_count            68
8      cricket.mahesh.TOTAL.sur_count            66
9    cricket.mahesh.TOTAL.panji_count            77
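For reference, the steps of this solution can be put together as one runnable sketch. The sample frame is reconstructed from the printed output of df1, since the DataFrame in the question is cut off; storing person_count as strings is an assumption, inferred from the astype(int) call.

```python
import pandas as pd

# Sample data reconstructed from the printed df1 output (the question's own
# DataFrame is truncated). Keeping person_count as strings is an assumption.
df = pd.DataFrame({
    'sport_name': ['football'] * 5 + ['cricket'] * 4,
    'person_name': ['ramesh'] * 4 + ['mohit'] + ['mahesh'] * 4,
    'city': ['mumbai', 'mumbai', 'delhi', 'delhi', 'pune',
             'surat', 'surat', 'panji', 'panji'],
    'person_symbol': ['ram', 'mum', 'mum', 'ram', 'moh',
                      'mah', 'sur', 'sur', 'mah'],
    'person_count': ['10', '14', '25', '20', '11', '34', '23', '43', '34'],
})

# Convert the counts to integers so they can be summed
df['person_count'] = df['person_count'].astype(int)

# Stack city and person_symbol into a single 'val' column
df1 = (df.set_index(['sport_name', 'person_name', 'person_count'])
         .stack()
         .reset_index(name='val'))

# Build the grouping key, e.g. 'football.ramesh.TOTAL.mumbai_count'
key = (df1['sport_name'] + '.' + df1['person_name']
       + '.TOTAL.' + df1['val'] + '_count')

# Sum person_count per key, keeping first-seen order (sort=False)
df2 = (df1['person_count']
       .groupby(key.rename('derived_symbol'), sort=False)
       .sum()
       .reset_index(name='person_count'))

# Look totals up by symbol
totals = dict(zip(df2['derived_symbol'], df2['person_count']))
print(totals['football.ramesh.TOTAL.mumbai_count'])
```

Because every original row contributes once through city and once through person_symbol, each person_count is counted in two different totals.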
Solution 2:
Here's one way.
First, change the person_count type to numeric:
In [2126]: df.person_count = df.person_count.astype(int)
Reshape your data to get city and person_symbol under one level, and then groupby to get the total count.
In [2127]: dff = (df.melt(id_vars=['sport_name', 'person_name', 'person_count'])
.groupby(['sport_name', 'person_name', 'value']).person_count.sum())
In [2128]: dff
Out[2128]:
sport_name  person_name  value
cricket     mahesh       mah       68
                         panji     77
                         sur       66
                         surat     57
football    mohit        moh       11
                         pune      11
            ramesh       delhi     45
                         mum       39
                         mumbai    24
                         ram       30
Name: person_count, dtype: int32
Format the index levels with a custom format.
In [2129]: dff.index = ['{0}.{1}.TOTAL.{2}_count'.format(*idx) for idx in dff.index]
In [2130]: dff
Out[2130]:
cricket.mahesh.TOTAL.mah_count        68
cricket.mahesh.TOTAL.panji_count      77
cricket.mahesh.TOTAL.sur_count        66
cricket.mahesh.TOTAL.surat_count      57
football.mohit.TOTAL.moh_count        11
football.mohit.TOTAL.pune_count       11
football.ramesh.TOTAL.delhi_count     45
football.ramesh.TOTAL.mum_count       39
football.ramesh.TOTAL.mumbai_count    24
football.ramesh.TOTAL.ram_count       30
Name: person_count, dtype: int32
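The melt-based steps can also be wrapped in a small helper. This is only a minimal sketch: the function name total_counts is hypothetical, and the sample frame is reconstructed from the printed output, since the DataFrame in the question is truncated (person_count stored as strings is likewise an assumption).

```python
import pandas as pd


def total_counts(df):
    """Sum person_count per sport/person for each city and person_symbol value.

    Hypothetical helper, not part of pandas.
    """
    # Work on a copy with an integer person_count column
    out = df.assign(person_count=df['person_count'].astype(int))
    # melt puts city and person_symbol values into one 'value' column
    sums = (out.melt(id_vars=['sport_name', 'person_name', 'person_count'])
               .groupby(['sport_name', 'person_name', 'value'])['person_count']
               .sum())
    # Flatten the three index levels into the derived symbol string
    sums.index = ['{0}.{1}.TOTAL.{2}_count'.format(*idx) for idx in sums.index]
    return sums


# Sample data reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    'sport_name': ['football'] * 5 + ['cricket'] * 4,
    'person_name': ['ramesh'] * 4 + ['mohit'] + ['mahesh'] * 4,
    'city': ['mumbai', 'mumbai', 'delhi', 'delhi', 'pune',
             'surat', 'surat', 'panji', 'panji'],
    'person_symbol': ['ram', 'mum', 'mum', 'ram', 'moh',
                      'mah', 'sur', 'sur', 'mah'],
    'person_count': ['10', '14', '25', '20', '11', '34', '23', '43', '34'],
})

result = total_counts(df)
print(result)
```

Note that groupby sorts the group keys by default, so the rows come out alphabetically rather than in first-seen order; passing sort=False to groupby would keep the original order instead.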