
Why np.std() and pivot_table(aggfunc=np.std) Return Different Results

I have some code and do not understand why the difference occurs: np.std() defaults to ddof=0 when it is used alone, but why, when it is used as an argument in pivot_table(aggfunc=n

Solution 1:

pivot_table uses DataFrame.groupby.agg, and when you supply an aggregation function it tries to figure out exactly how to _aggregate.

arg=np.std gets handled here, the relevant code being:

f = self._get_cython_func(arg)
if f and not args and not kwargs:
    return getattr(self, f)(), None

Hidden in the DataFrame class is this table:

pd.DataFrame()._cython_table
# OrderedDict([(<function sum>, 'sum'),
#              (<function max>, 'max'),
#              ...
#              (<function numpy.std>, 'std'),
#              (<function numpy.nancumsum>, 'cumsum')])

pd.DataFrame()._cython_table.get(np.std)
#'std'
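
The lookup described above can be imitated in a few lines. This is only a sketch of the idea, not pandas' actual implementation: ALIAS_TABLE and aggregate_sketch are hypothetical names made up for illustration, and the table holds just three entries.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the internal lookup table (not the real one).
ALIAS_TABLE = {np.sum: "sum", np.max: "max", np.std: "std"}


def aggregate_sketch(series, func, *args, **kwargs):
    """Mimic the dispatch: swap a known function for the pandas method."""
    name = ALIAS_TABLE.get(func)
    if name and not args and not kwargs:
        # The function is only used as a key; the pandas method runs
        # with pandas defaults (ddof=1 for std).
        return getattr(series, name)()
    # Otherwise call the function as given, on the raw NumPy values.
    return func(series.to_numpy(), *args, **kwargs)


s = pd.Series([1.0, 2.0, 3.0, 4.0])
aliased = aggregate_sketch(s, np.std)          # pandas std, ddof=1, about 1.2910
direct = aggregate_sketch(s, np.std, ddof=0)   # NumPy std, ddof=0, about 1.1180
```

In this sketch, passing any extra argument skips the alias branch, which mirrors the `not args and not kwargs` condition in the quoted code.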

And so np.std is only used to select the attribute to call; its default ddof=0 is completely ignored, and the pandas default of ddof=1 is used instead.

getattr(dft['D'], 'std')()
#1.6669847417133286
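
To see both defaults side by side, and one way to get the population standard deviation inside pivot_table, here is a small sketch. The DataFrame is made-up data (the original dft is not shown), and the column names group and value are assumptions. The string "std" and an explicit lambda are used so that the result does not depend on how a given pandas version treats np.std.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})

# NumPy default on a plain array: population standard deviation (ddof=0).
population = np.std(df["value"].to_numpy())

# pandas default: sample standard deviation (ddof=1).
sample = df["value"].std()

# The string name selects the pandas method, so ddof=1 applies per group.
pivot_sample = df.pivot_table(index="group", values="value", aggfunc="std")

# An explicit lambda states the ddof directly, so ddof=0 applies per group.
pivot_population = df.pivot_table(
    index="group", values="value", aggfunc=lambda x: x.std(ddof=0)
)
```

For group a (values 1, 2, 3), pivot_sample holds 1.0 while pivot_population holds roughly 0.8165, which is the same kind of gap the question observed.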
