Pandas Group By Operations On A Data Frame
I have a pandas data frame like the one below.
UsrId JobNos
1     4
1     56
2     23
2     55
2     41
2     5
3     78
1     25
3     1
I group by
Solution 1:
Something like df.groupby('UsrId').JobNos.sum().idxmax() should do it:
In [1]: import pandas as pd
In [2]: from io import StringIO
In [3]: data = """UsrId JobNos
...: 1 4
...: 1 56
...: 2 23
...: 2 55
...: 2 41
...: 2 5
...: 3 78
...: 1 25
...: 3 1"""
In [4]: df = pd.read_csv(StringIO(data), sep=r'\s+')
In [5]: grouped = df.groupby('UsrId')
In [6]: grouped.JobNos.sum()
Out[6]:
UsrId
1         85
2        124
3         79
Name: JobNos
In [7]: grouped.JobNos.sum().idxmax()
Out[7]: 2
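For readers on Python 3 with a recent pandas, the same steps can be sketched as a standalone script. This is a minimal sketch rather than the answer's exact code: the names `totals` and `top_user` are illustrative, and `io.StringIO` is assumed in place of the Python 2 `StringIO` module.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question.
data = """UsrId JobNos
1 4
1 56
2 23
2 55
2 41
2 5
3 78
1 25
3 1"""

# Columns are whitespace-separated; the raw string keeps \s as a regex escape.
df = pd.read_csv(StringIO(data), sep=r"\s+")

# Sum JobNos per user, then take the index label (UsrId) of the largest sum.
totals = df.groupby("UsrId")["JobNos"].sum()
top_user = totals.idxmax()

print(totals)
print(top_user)
```

Selecting the column with `["JobNos"]` is equivalent to the attribute form `.JobNos` used in the session, and it also works for column names that are not valid Python identifiers.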
If you want your results based on the number of items in each group:
In [8]: grouped.size()
Out[8]:
UsrId
1        3
2        4
3        2
In [9]: grouped.size().idxmax()
Out[9]: 2
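If both measures are of interest, they can be computed in one pass. The sketch below assumes the same sample data, rebuilt inline so that it runs on its own; `summary` and `winners` are illustrative names, and the result columns `sum` and `size` take their names from the aggregations passed to `agg`.

```python
import pandas as pd

# Same sample data as in the question, built directly as a DataFrame.
df = pd.DataFrame({
    "UsrId":  [1, 1, 2, 2, 2, 2, 3, 1, 3],
    "JobNos": [4, 56, 23, 55, 41, 5, 78, 25, 1],
})

# One row per UsrId: the total of JobNos and the number of rows in the group.
summary = df.groupby("UsrId")["JobNos"].agg(["sum", "size"])

# On a DataFrame, idxmax returns the index label of the maximum for each column.
winners = summary.idxmax()

print(summary)
print(winners)
```

With this data both criteria pick user 2, but they can disagree in general: a user with few rows can still have the largest total. When several groups tie for the maximum, `idxmax` returns the first matching label.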
Update: To get ordered results you can use the .sort_values method (named .order in older pandas releases):
In [10]: grouped.JobNos.sum().sort_values(ascending=False)
Out[10]:
UsrId
2        124
1         85
3         79
Name: JobNos
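When only the top few users matter, or the ranking should become a regular DataFrame again, the ordered totals can be taken a step further. This is a minimal sketch assuming a recent pandas, where a Series is sorted with `sort_values`; the sample data is rebuilt inline, and `ranked`, `top_two`, `ranked_df`, and the column name `TotalJobNos` are illustrative.

```python
import pandas as pd

# Same sample data as in the question, built directly as a DataFrame.
df = pd.DataFrame({
    "UsrId":  [1, 1, 2, 2, 2, 2, 3, 1, 3],
    "JobNos": [4, 56, 23, 55, 41, 5, 78, 25, 1],
})

totals = df.groupby("UsrId")["JobNos"].sum()

# Full ranking, largest total first.
ranked = totals.sort_values(ascending=False)

# Shortcut when only the top N groups are needed.
top_two = totals.nlargest(2)

# Turn the ranking back into a DataFrame with UsrId as an ordinary column.
ranked_df = ranked.reset_index(name="TotalJobNos")

print(ranked)
print(top_two)
print(ranked_df)
```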