Apply Expanding Function On Dataframe

April 19, 2024

I have a function that I wish to apply to subsets of a pandas DataFrame, so that the function is calculated on all rows (up to the current row) from the same group - i.e. using a gro

Solution 1:

A possible solution is to move the expanding part into the function and use GroupBy.apply:

def foo1(_df):
    # Expanding max of x1 times the last x2 value of each expanding window.
    return _df['x1'].expanding().max() * _df['x2'].expanding().apply(lambda x: x[-1], raw=True)

df['foo_result'] = df.groupby('group').apply(foo1).reset_index(level=0, drop=True)
print(df)
  group  time   x1  x2  foo_result
0     A     1   10   1        10.0
3     B     1  100   2       200.0
1     A     2   40   2        80.0
4     B     2  200   0         0.0
2     A     3   30   1        40.0
5     B     3  300   3       900.0

This is not a direct solution to the problem of applying a dataframe function to an expanding dataframe, but it achieves the same functionality.
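For checking the numbers above, here is a minimal self-contained sketch. The sample DataFrame is an assumption, reconstructed from the printed output, since the question's data is not shown here. Instead of GroupBy.apply it loops over the groups explicitly and concatenates the pieces. That is a substitution made for this sketch, not part of the answer: it keeps the original row index without the reset_index step, whose necessity can depend on the pandas version.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed result above.
df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "B", "B", "B"],
        "time": [1, 2, 3, 1, 2, 3],
        "x1": [10, 40, 30, 100, 200, 300],
        "x2": [1, 2, 1, 2, 0, 3],
    }
).sort_values("time", kind="mergesort")  # stable sort: index order 0, 3, 1, 4, 2, 5


def foo1(_df):
    # Expanding max of x1 times the last x2 value of each expanding window.
    running_max = _df["x1"].expanding().max()
    last_x2 = _df["x2"].expanding().apply(lambda x: x[-1], raw=True)
    return running_max * last_x2


# Run foo1 once per group; each piece keeps the group's original index,
# so the column assignment below aligns row by row.
pieces = [foo1(group_df) for _, group_df in df.groupby("group")]
df["foo_result"] = pd.concat(pieces)
print(df)
```

Because every piece carries the original index labels, the assignment aligns on the index and the row order of df does not matter.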

Solution 2:

Applying a dataframe function on an expanding window is apparently not possible (at least not in pandas version 0.23.0), as one can see by plugging a print statement into the function.

Running df.groupby('group').expanding().apply(lambda x: bool(print(x)), raw=False) on the given DataFrame (where the bool around the print is just to get a valid return value) prints:

0    1.0
dtype: float64
0    1.0
1    2.0
dtype: float64
0    1.0
1    2.0
2    3.0
dtype: float64
0    10.0
dtype: float64
0    10.0
1    40.0
dtype: float64
0    10.0
1    40.0
2    30.0
dtype: float64

(and so on - and also returns a dataframe with '0.0' in each cell, of course).


This shows that the expanding window works on a column-by-column basis (we see that the expanding windows of the time column are printed first, then those of x1, and so on), and does not really work on a dataframe - so a dataframe function can't be applied to it.
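The same observation can be made without reading printed output by recording what the callback receives. The sketch below is only an illustration: the small two-column DataFrame and the helper name probe are made up for this example, and the grouping is left out to keep it short.

```python
import pandas as pd

# Hypothetical two-column frame, just to observe what apply() passes in.
df = pd.DataFrame({"time": [1, 2, 3], "x1": [10, 40, 30]})

seen = []  # (type name, ndim, length) of every window handed to the callback


def probe(window):
    seen.append((type(window).__name__, window.ndim, len(window)))
    return 0.0  # expanding().apply expects a numeric return value


df.expanding().apply(probe, raw=False)
print(seen)
```

With raw=False every recorded window should be a one-dimensional Series taken from a single column, never a two-dimensional slice of the DataFrame.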

So, to get the desired functionality, one would have to put the expanding inside the dataframe function, as in the accepted answer.
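If the function cannot easily be rewritten in terms of per-column expanding calls, another option is to build the expanding windows by hand with iloc slices. The following is a rough sketch under assumptions: expanding_apply and foo are hypothetical names, not pandas APIs, and the sample data is reconstructed from the output shown in Solution 1. It calls the function once per row on a growing slice, so the cost grows quadratically with group size.

```python
import pandas as pd


def expanding_apply(frame, func):
    """Call func on frame.iloc[:1], frame.iloc[:2], ... and collect the results.

    Hypothetical helper for illustration; not part of pandas.
    """
    values = [func(frame.iloc[: i + 1]) for i in range(len(frame))]
    return pd.Series(values, index=frame.index)


def foo(window):
    # A genuine DataFrame function: it reads two columns of the same window.
    return window["x1"].max() * window["x2"].iloc[-1]


# Assumed sample data, reconstructed from the output in Solution 1.
df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "B", "B", "B"],
        "time": [1, 2, 3, 1, 2, 3],
        "x1": [10, 40, 30, 100, 200, 300],
        "x2": [1, 2, 1, 2, 0, 3],
    }
)

# Apply the helper per group and stitch the pieces back together by index.
pieces = [expanding_apply(group_df, foo) for _, group_df in df.groupby("group")]
df["foo_result"] = pd.concat(pieces)
print(df)
```

For large groups the per-column expanding approach from Solution 1 is likely to be much faster, so this form is mainly useful when the function really needs the whole sub-DataFrame.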
