
Transformation Of Transactions To Numpy Array

I have a list of daily transactional data in the following format: person, itemCode, transDate, amount. I would like to sum the amount column by person and itemCode and transform my

Solution 1:

As @DSM said, this operation looks like a job for pandas:

>>> from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
>>> import pandas as pd
>>> data = '''A, 1, 2013-10-10, .5
... A, 1, 2013-10-18, .75
... A, 2, 2013-10-20, 2.5
... B, 1, 2013-10-09, .25
... B, 2, 2014-10-20, .8'''
>>> df = pd.read_csv(StringIO(data), names=['person','itemCode','transDate','amount'], skiprows=0)
>>> df
  person  itemCode    transDate  amount
0      A         1   2013-10-10    0.50
1      A         1   2013-10-18    0.75
2      A         2   2013-10-20    2.50
3      B         1   2013-10-09    0.25
4      B         2   2014-10-20    0.80
>>> grouped = df.groupby(['person'])
>>> res = df.groupby(['person']).apply(lambda x: pd.Series(x.groupby('itemCode').sum()['amount']))
>>> res
itemCode     1    2
person             
A         1.25  2.5
B         0.25  0.8
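
For comparison, the same person-by-itemCode table can probably be built in one step with pivot_table. This is only a sketch of an alternative, not the method used in the answer above; skipinitialspace=True and fill_value=0 are assumptions added here (the first trims the blank after each comma, the second writes 0 where a person has no transactions for an item).

```python
from io import StringIO

import pandas as pd

# Same sample rows as in the answer above.
data = """A, 1, 2013-10-10, .5
A, 1, 2013-10-18, .75
A, 2, 2013-10-20, 2.5
B, 1, 2013-10-09, .25
B, 2, 2014-10-20, .8"""

df = pd.read_csv(
    StringIO(data),
    names=["person", "itemCode", "transDate", "amount"],
    skipinitialspace=True,  # assumption: strip the blank after each comma
)

# One row per person, one column per itemCode, summed amounts in the cells.
# fill_value=0 is an assumption for person/item pairs with no transactions.
table = df.pivot_table(
    index="person",
    columns="itemCode",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)
print(table)
```

Unlike the nested groupby, this form still returns a rectangular table when some person is missing an item, which matters if the next step is a conversion to a 2-D array.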

The result is a pandas.DataFrame, but if you want it as a NumPy array, you can use the values attribute:

>>> res.values
array([[ 1.25,  2.5 ],
       [ 0.25,  0.8 ]])
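
The plain array drops the person and itemCode labels, so it can help to keep them in separate arrays. A minimal sketch, assuming pandas 0.24 or later (where DataFrame.to_numpy() exists); the res table is rebuilt by hand here only so the snippet runs on its own, and the names people, items, row and col are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the summed table from the answer so this snippet is self-contained.
res = pd.DataFrame(
    [[1.25, 2.5], [0.25, 0.8]],
    index=pd.Index(["A", "B"], name="person"),
    columns=pd.Index([1, 2], name="itemCode"),
)

arr = res.to_numpy()            # same data as res.values
people = res.index.to_numpy()   # row labels, in row order
items = res.columns.to_numpy()  # column labels, in column order

# Look up the total for person "B", item 2 by position in the array.
row = int(np.where(people == "B")[0][0])
col = int(np.where(items == 2)[0][0])
total = arr[row, col]
print(total)
```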
