Keep Upper N Rows Of A Pandas Dataframe Based On Condition
How would I delete all rows from a dataframe that come after a certain fulfilled condition? As an example I have the following dataframe:
import pandas as pd
xEnd=1
yEnd=2
df = pd
Solution 1:
To slice your dataframe up to the first row where a condition across two series is satisfied, first calculate the required index and then slice via iloc. You can calculate the index via set_index, isin and np.ndarray.argmax:
idx = df.set_index(['x', 'y']).index.isin([(xEnd, yEnd)]).argmax()
res = df.iloc[:idx+1]
print(res)
x y id
0 1 1 0
1 1 2 1
If you need better performance, see Efficiently return the index of the first value satisfying condition in array.
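The same idea can be sketched with a plain boolean mask instead of an index lookup. This is only an illustration: the sample data, the helper name keep_until_first_match, and the choice to return the whole frame when nothing matches are all assumptions, since the question's dataframe is not shown in full. The no-match check matters because argmax returns 0 on an all-False array, which would otherwise cut the frame after its first row.

```python
import pandas as pd

# Hypothetical sample data; the question's actual dataframe is not shown in full.
df = pd.DataFrame({
    "x": [1, 1, 1, 2, 2, 2],
    "y": [1, 2, 3, 3, 4, 3],
    "id": [0, 1, 2, 3, 4, 5],
})
xEnd, yEnd = 1, 2


def keep_until_first_match(frame, x_end, y_end):
    """Return rows up to and including the first row where x == x_end and y == y_end."""
    mask = ((frame["x"] == x_end) & (frame["y"] == y_end)).to_numpy()
    if not mask.any():
        # Assumption: with no match, keep everything instead of cutting at row 0.
        return frame
    first = mask.argmax()  # position of the first True
    return frame.iloc[: first + 1]


res = keep_until_first_match(df, xEnd, yEnd)
print(res)
```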
Solution 2:
Not 100% sure I understand correctly, but you can filter your dataframe like this:
df[(df.x <= xEnd) & (df.y <= yEnd)]
This yields the dataframe:
id x y
0 0 1 1
1 1 1 2
If x and y are not strictly increasing and you want everything up to and including the first row that satisfies the condition (this assumes the default, increasing index):
df[df.index <= df[(df.x == xEnd) & (df.y == yEnd)].index[0]]
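To see why the two variants in this answer can differ, here is a small sketch on made-up data where x and y go up and down. The data is hypothetical, and the cumsum/shift cut is an alternative technique swapped in for illustration (it does not depend on the index labels), not the code from the answer.

```python
import pandas as pd

# Hypothetical data in which x and y are not increasing.
df = pd.DataFrame({"id": [0, 1, 2, 3], "x": [1, 2, 1, 1], "y": [1, 3, 2, 1]})
xEnd, yEnd = 1, 2

# Value filter: keeps every row within the bounds, wherever it sits.
filtered = df[(df.x <= xEnd) & (df.y <= yEnd)]

# Positional cut: keep rows until the first exact match has been seen.
match = (df.x == xEnd) & (df.y == yEnd)
seen_before = match.cumsum().shift(fill_value=0)  # number of matches strictly above each row
truncated = df[seen_before.eq(0)]

print(filtered["id"].tolist())   # rows 0, 2, 3
print(truncated["id"].tolist())  # rows 0, 1, 2
```

The filter drops row 1 even though it sits above the matching row, and keeps row 3 even though it sits below it; the cut keeps exactly the rows up to and including the first match.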
Solution 3:
df = df.iloc[:yEnd, :]
This selects just the first two rows (yEnd is 2) and keeps all columns. You can assign the result to a new dataframe or reuse the same variable name.
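A short sketch of positional row selection, on hypothetical sample data with an assumed row count n. Note that this keeps a fixed number of leading rows and does not look at the x and y values at all.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"x": [1, 1, 1, 2], "y": [1, 2, 3, 3], "id": [0, 1, 2, 3]})
n = 2  # number of leading rows to keep (assumed for illustration)

first_rows = df.iloc[:n, :]  # rows at positions 0..n-1, all columns
same_rows = df.head(n)       # equivalent shorthand

print(first_rows.equals(same_rows))
```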