Pandas: Create A New Data Frame Using Multiple Groupby Results
My data is a Data Frame with retail items and their sales performance. Columns include: 2016 unit sales, 2015 unit sales, item description, etc. When I try to do a groupby for bran
Solution 1:
You can do it this way (recent pandas versions require a list of column names, hence the double brackets):
Data.groupby(by="Major Brand")[["2016 Units","2015 Units"]].sum()
Demo:
In [29]: Data.groupby(by="Major Brand")[["2016 Units","2015 Units"]].sum()
Out[29]:
             2016 Units  2015 Units
Major Brand
1                   218         238
2                   172         122
3                   192         273
4                   176         172
Data:
In [30]: Data
Out[30]:
    Major Brand  2016 Units  2015 Units    X
0             1          75          83  xxx
1             1          82          95  xxx
2             3          85          47  xxx
3             3           1          40  xxx
4             1          43          43  xxx
5             4          35          65  xxx
6             3          38          71  xxx
7             4          56          90  xxx
8             3           9          77  xxx
9             1          18          17  xxx
10            3          59          38  xxx
11            4          85          17  xxx
12            2          64          13  xxx
13            2          32          33  xxx
14            2          76          76  xxx
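To end up with a standalone DataFrame in which the brand is an ordinary column rather than the index, something like the following may work. This is only a sketch: the frame is a small stand-in built from the first rows of the sample data, and `totals` is a name made up for illustration.

```python
import pandas as pd

# Small stand-in for the question's data (first six rows of the sample).
Data = pd.DataFrame({
    "Major Brand": [1, 1, 3, 3, 1, 4],
    "2016 Units": [75, 82, 85, 1, 43, 35],
    "2015 Units": [83, 95, 47, 40, 43, 65],
})

# Select the columns with a list (double brackets), sum per brand,
# then move the group key from the index back into a regular column.
totals = (
    Data.groupby("Major Brand")[["2016 Units", "2015 Units"]]
    .sum()
    .reset_index()
)
print(totals)
```

`reset_index()` is what turns the grouped result into a flat frame that can be merged or exported like any other.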
Solution 2:
I get the following error: TypeError: unorderable types: int() < str()
Could it be that your dtypes are not correct, e.g. str instead of int? You could try creating your DataFrame as follows:
In [18]: import numpy as np; import pandas as pd
In [19]: col1 = ['adidas','nike','yourturn','zara','nike','nike','bla','bla','zalando','amazon']
In [20]: data = {'Major Brand':col1, '2016 Units':range(len(col1)), '2015 Units':range(len(col1),len(col1)*2)}
In [21]: x = pd.DataFrame(data).astype({'2016 Units': np.int64, '2015 Units': np.int64})
In [22]: x.groupby(by="Major Brand").sum()
Out[22]:
             2015 Units  2016 Units
Major Brand
adidas               10           0
amazon               19           9
bla                  33          13
nike                 40          10
yourturn             12           2
zalando              18           8
zara                 13           3
In [23]: x.groupby(by="Major Brand")[["2016 Units","2015 Units"]].sum()
Out[23]:
             2016 Units  2015 Units
Major Brand
adidas                0          10
amazon                9          19
bla                  13          33
nike                 10          40
yourturn              2          12
zalando               8          18
zara                  3          13
In [24]: x.dtypes
Out[24]:
2015 Units int64
2016 Units int64
Major Brand object
dtype: object
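If `dtypes` reports `object` for one of the unit columns, the values were probably read in as strings, which would be consistent with the `int() < str()` error. A minimal sketch of one way to convert them, assuming a hypothetical frame `raw` with string-valued unit columns (the data and the name `raw` are invented for illustration):

```python
import pandas as pd

# Hypothetical frame where the unit counts arrived as strings.
raw = pd.DataFrame({
    "Major Brand": ["nike", "nike", "zara"],
    "2016 Units": ["1", "4", "n/a"],
    "2015 Units": ["11", "14", "13"],
})

unit_cols = ["2016 Units", "2015 Units"]
# errors="coerce" turns entries that cannot be parsed into NaN instead of raising.
raw[unit_cols] = raw[unit_cols].apply(pd.to_numeric, errors="coerce")

print(raw.dtypes)
summed = raw.groupby("Major Brand")[unit_cols].sum()
print(summed)
```

Coercing is a design choice: it keeps the rows but silently drops bad values from the sums, so it may be worth checking how many NaN values appear before trusting the totals.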
In [25]: x.groupby(by="Major Brand").agg(['count','sum','mean','median'])
Out[25]:
             2015 Units                       2016 Units
                  count sum       mean median      count sum      mean median
Major Brand
adidas                1  10  10.000000   10.0          1   0  0.000000    0.0
amazon                1  19  19.000000   19.0          1   9  9.000000    9.0
bla                   2  33  16.500000   16.5          2  13  6.500000    6.5
nike                  3  40  13.333333   14.0          3  10  3.333333    4.0
yourturn              1  12  12.000000   12.0          1   2  2.000000    2.0
zalando               1  18  18.000000   18.0          1   8  8.000000    8.0
zara                  1  13  13.000000   13.0          1   3  3.000000    3.0
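The `agg` call above returns two-level column labels. If a flat new DataFrame is preferred, named aggregation (available in pandas 0.25 and later) is one option. The sketch below rebuilds the same example frame; the output column names such as `units_2016_sum` are made up for illustration.

```python
import pandas as pd

col1 = ['adidas', 'nike', 'yourturn', 'zara', 'nike', 'nike', 'bla', 'bla', 'zalando', 'amazon']
x = pd.DataFrame({
    'Major Brand': col1,
    '2016 Units': range(len(col1)),
    '2015 Units': range(len(col1), len(col1) * 2),
})

# Each keyword names an output column; the tuple is (source column, aggregation).
summary = x.groupby('Major Brand').agg(
    units_2016_sum=('2016 Units', 'sum'),
    units_2016_mean=('2016 Units', 'mean'),
    units_2015_sum=('2015 Units', 'sum'),
    units_2015_mean=('2015 Units', 'mean'),
).reset_index()
print(summary)
```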