
Pandas Group By Operations On A Data Frame

I have a pandas data frame like the one below.

UsrId   JobNos
1       4
1       56
2       23
2       55
2       41
2       5
3       78
1       25
3       1

I group by

Solution 1:

Something like df.groupby('UsrId').JobNos.sum().idxmax() should do it:

In [1]: import pandas as pd

In [2]: from io import StringIO

In [3]: data = """UsrId   JobNos
   ...:  1       4
   ...:  1       56
   ...:  2       23 
   ...:  2       55
   ...:  2       41
   ...:  2       5
   ...:  3       78
   ...:  1       25
   ...:  3       1"""

In [4]: df = pd.read_csv(StringIO(data), sep=r'\s+')

In [5]: grouped = df.groupby('UsrId')

In [6]: grouped.JobNos.sum()
Out[6]: 
UsrId
1     85
2    124
3     79
Name: JobNos

In [7]: grouped.JobNos.sum().idxmax()
Out[7]: 2

If you want your results based on the number of items in each group:

In [8]: grouped.size()
Out[8]: 
UsrId
1    3
2    4
3    2

In [9]: grouped.size().idxmax()
Out[9]: 2
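
The two views above (total of JobNos per user, and number of rows per user) can also be computed together. The following is a minimal sketch, assuming a pandas version that supports named aggregation (0.25 or later); the output column names total_jobs and n_rows are illustrative and not part of the original answer. Note that "count" counts non-null values, which matches the group size here only because the sample data has no missing entries.

```python
import pandas as pd
from io import StringIO

# Sample data from the question.
data = """UsrId JobNos
1 4
1 56
2 23
2 55
2 41
2 5
3 78
1 25
3 1"""

df = pd.read_csv(StringIO(data), sep=r"\s+")

# One row per user: sum of JobNos and number of non-null JobNos values.
# The names total_jobs and n_rows are hypothetical labels chosen for this sketch.
summary = df.groupby("UsrId")["JobNos"].agg(total_jobs="sum", n_rows="count")

# idxmax returns the index label (the UsrId) of the largest value.
top_by_total = summary["total_jobs"].idxmax()
top_by_rows = summary["n_rows"].idxmax()

print(summary)
print(top_by_total, top_by_rows)
```

For this data both lookups point to the same user, but they can differ in general, so it is worth deciding which definition of "largest group" is wanted.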

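Update: To get ordered results, sort the sums in descending order. Older pandas releases used the .order method for this; it has since been removed, and .sort_values is the replacement:

In [10]: grouped.JobNos.sum().sort_values(ascending=False)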
Out[10]: 
UsrId
2    124
1     85
3     79
Name: JobNos
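
If only the top few users are needed rather than the full ordering, a sorted view and a top-N view can be sketched as follows. This assumes a current pandas release where Series.sort_values and Series.nlargest are available; the variable names are illustrative and not from the original answer.

```python
import pandas as pd

# Same sample data as in the question, built directly as a DataFrame.
df = pd.DataFrame({
    "UsrId":  [1, 1, 2, 2, 2, 2, 3, 1, 3],
    "JobNos": [4, 56, 23, 55, 41, 5, 78, 25, 1],
})

# Sum of JobNos per user.
totals = df.groupby("UsrId")["JobNos"].sum()

# Full ranking, largest total first.
ranked = totals.sort_values(ascending=False)

# Only the two users with the largest totals.
top_two = totals.nlargest(2)

print(ranked)
print(top_two)
```

nlargest avoids sorting the whole series when just a handful of groups matter, which may help on data with many distinct users.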
