Keep Upper N Rows Of A Pandas Dataframe Based On Condition

May 24, 2023

How would I delete all rows from a dataframe that come after the first row fulfilling a certain condition? As an example, I have the following dataframe:

import pandas as pd
xEnd = 1
yEnd = 2
df = pd

Solution 1:

To slice your dataframe up to the first row where a condition across two series is satisfied, first calculate the required index and then slice via iloc.

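You can calculate the index via set_index, isin and np.ndarray.argmax. Note that isin has to be called on the resulting MultiIndex with a list of tuples; called on the frame itself, it would test the remaining id column instead of x and y:

idx = df.set_index(['x', 'y']).index.isin([(xEnd, yEnd)]).argmax()
res = df.iloc[:idx+1]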

print(res)

   x  y  id
0  1  1   0
1  1  2   1

If you need better performance, see Efficiently return the index of the first value satisfying condition in array.
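One caveat with argmax is that it returns 0 when no row matches, which would silently keep only the first row. A minimal sketch of the same idea with an explicit guard is shown here; the sample data is hypothetical (the original dataframe is not shown in full), and keeping the whole frame when nothing matches is an assumption about the desired behaviour.

```python
import pandas as pd

xEnd, yEnd = 1, 2

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({
    "x": [1, 1, 1, 2, 2, 2],
    "y": [1, 2, 3, 1, 2, 3],
    "id": range(6),
})

# Boolean NumPy array: True where both columns hit their end values.
mask = ((df["x"] == xEnd) & (df["y"] == yEnd)).to_numpy()

if mask.any():
    # argmax gives the position of the first True.
    idx = mask.argmax()
    res = df.iloc[: idx + 1]
else:
    # Assumption: with no match, keep everything.
    res = df

print(res)
```

Comparing the two columns directly avoids building a MultiIndex, and to_numpy() keeps the result positional so it pairs naturally with iloc.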

Solution 2:

Not 100% sure I understand correctly, but you can filter your dataframe like this:

 df[(df.x <= xEnd) & (df.y <= yEnd)]

This yields the dataframe:

   id  x  y
0   0  1  1
1   1  1  2

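If x and y are not strictly increasing and you want everything up to and including the first row that satisfies the condition, compare the index against the first matching label (this assumes a sorted index and at least one matching row):

 df[df.index <= df[(df.x == xEnd) & (df.y == yEnd)].index[0]]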
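To see why the plain filter and the "everything up to the first match" selection differ on non-monotonic data, here is a small sketch. The data is hypothetical, and the cumulative-count approach is an alternative I am substituting for the index comparison, not the answer's own method; it does not depend on the index being sorted and keeps all rows when nothing matches.

```python
import pandas as pd

xEnd, yEnd = 1, 2

# Hypothetical data where x and y are not monotonically increasing.
df = pd.DataFrame({
    "x": [1, 1, 2, 1, 1],
    "y": [1, 2, 3, 1, 2],
    "id": range(5),
})

# Plain filter: also keeps later rows that happen to fall within the bounds.
filtered = df[(df.x <= xEnd) & (df.y <= yEnd)]

# Exact-match mask, then count how many matches occur strictly before each row.
mask = (df.x == xEnd) & (df.y == yEnd)
matches_before = mask.cumsum() - mask.astype(int)

# Keep rows with no earlier match, i.e. up to and including the first match.
res = df[matches_before == 0]

print(filtered.index.tolist())
print(res.index.tolist())
```

The filter is fine when both columns only ever increase; otherwise the cumulative count gives the "cut after the first match" behaviour the question asks for.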

Solution 3:

df = df.iloc[:yEnd, :]

This selects just the first two rows by position (yEnd is 2 here) and keeps all columns. You can assign the result to a new dataframe or reuse the same variable name.
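As a rough illustration of purely positional selection, the sketch below keeps the first n rows with iloc and with head. Both the data and the name n are hypothetical; note that this ignores the condition on x and y entirely and only works when the number of rows to keep is already known.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    "x": [1, 1, 1, 2],
    "y": [1, 2, 3, 1],
    "id": range(4),
})

n = 2  # hypothetical number of leading rows to keep

# Positional slice: rows 0..n-1, all columns.
by_iloc = df.iloc[:n, :]

# head(n) returns the same leading rows.
by_head = df.head(n)

print(by_iloc.equals(by_head))
```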
