
Keep Upper N Rows Of A Pandas Dataframe Based On Condition

How would I delete all rows from a dataframe that come after a certain condition is fulfilled? As an example, I have the following dataframe: import pandas as pd xEnd=1 yEnd=2 df = pd

Solution 1:

To slice your dataframe up to the first row where a condition across two series is satisfied, first calculate the required index and then slice via iloc.

You can calculate the index via set_index, isin and np.ndarray.argmax:

idx = df.set_index(['x', 'y']).index.isin([(xEnd, yEnd)]).argmax()
res = df.iloc[:idx+1]

print(res)

   x  y  id
0  1  1   0
1  1  2   1

If you need better performance, see Efficiently return the index of the first value satisfying condition in array.
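
As a minimal, self-contained sketch of the same idea, the example below builds the condition as a boolean mask and uses argmax to find the first match. The sample DataFrame is hypothetical (the question's data is not shown in full), and keeping every row when nothing matches is an assumption, not something the answer specifies.

```python
import pandas as pd

# Hypothetical sample data; the original question's DataFrame is not shown in full.
df = pd.DataFrame({
    "x": [1, 1, 2, 1, 1],
    "y": [1, 2, 1, 1, 2],
    "id": [0, 1, 2, 3, 4],
})
xEnd, yEnd = 1, 2

# True on every row where both columns match the target values.
mask = ((df["x"] == xEnd) & (df["y"] == yEnd)).to_numpy()

if mask.any():
    # argmax returns the position of the first True in the mask.
    idx = mask.argmax()
    res = df.iloc[: idx + 1]
else:
    # argmax returns 0 for an all-False mask, so "no match" is handled
    # explicitly; keeping every row is an assumed choice.
    res = df

print(res)
```

The explicit check matters because argmax cannot tell "first row matches" apart from "no row matches".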


Solution 2:

Not 100% sure I understand correctly, but you can filter your dataframe like this:

 df[(df.x <= xEnd) & (df.y <= yEnd)]

This yields the dataframe:

   id   x   y   
0   0   1   1   
1   1   1   2 

If x and y are not strictly increasing and you want everything above the first row that satisfies the condition:

 df[df.index <= df[(df.x == xEnd) & (df.y == yEnd)].index[0]]
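
The difference between the two filters can be seen on data that is not increasing. The sketch below uses a hypothetical DataFrame, and instead of comparing index labels it uses a running count of matches (cumsum and shift), a swapped-in variant that also keeps all rows when nothing matches.

```python
import pandas as pd

# Hypothetical sample data in which x and y are not increasing.
df = pd.DataFrame({
    "x": [1, 1, 2, 1, 1],
    "y": [1, 2, 1, 1, 2],
    "id": [0, 1, 2, 3, 4],
})
xEnd, yEnd = 1, 2

# Value filter: keeps every row within the bounds, wherever it appears.
by_value = df[(df.x <= xEnd) & (df.y <= yEnd)]

# Positional cut: count the matches seen strictly before each row and keep
# rows while that count is still zero (up to and including the first match).
match = (df.x == xEnd) & (df.y == yEnd)
by_position = df[match.cumsum().shift(fill_value=0).eq(0)]

print(by_value["id"].tolist())     # rows kept by the value filter
print(by_position["id"].tolist())  # rows kept by the positional cut
```

Here the value filter also returns the later rows that fall within the bounds, while the positional cut stops at the first match.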

Solution 3:

df = df.iloc[0:yEnd, :]

This selects just the first two rows (yEnd is 2 here) and keeps all columns. You can assign the result to a new dataframe or reuse the same variable name.
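
When the number of rows to keep is already known, a positional slice and head give the same result. A small sketch, assuming a hypothetical DataFrame and a hypothetical row count n:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"x": [1, 1, 2], "y": [1, 2, 1], "id": [0, 1, 2]})
n = 2  # hypothetical number of leading rows to keep

first_rows = df.iloc[:n, :]  # positional slice: first n rows, all columns
same_rows = df.head(n)       # equivalent shorthand for the first n rows

print(first_rows.equals(same_rows))
```

Note that this keeps a fixed number of rows and does not look at the values in x or y.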

