Selecting Rows Based On Multiple Column Values In Pandas Dataframe
I have a pandas DataFrame df:
import pandas as pd
data = {'Name': ['AAAA', 'BBBB'], 'C1': [25, 12], 'C2': [2, 1], 'C3': [1, 10]}
df = pd.DataFrame(data)
d
Solution 1:
I think the following should do it, though its elegance is up for debate.
new_df = old_df[
    (old_df['C1'] > 0) & (old_df['C1'] < 20)
    & (old_df['C2'] > 0) & (old_df['C2'] < 20)
    & (old_df['C3'] > 0) & (old_df['C3'] < 20)
]
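If the repetition is the concern, the same mask can be built in a loop. This is only a sketch: it assumes the goal is to keep rows where C1, C2 and C3 all lie strictly between 0 and 20 (the question text is cut off, so that is inferred from the answers), and the names cols and mask are illustrative.

```python
import pandas as pd

# Sample data from the question.
data = {'Name': ['AAAA', 'BBBB'], 'C1': [25, 12], 'C2': [2, 1], 'C3': [1, 10]}
old_df = pd.DataFrame(data)

cols = ['C1', 'C2', 'C3']  # columns to check (illustrative name)

# Start with an all-True mask and narrow it one column at a time.
mask = pd.Series(True, index=old_df.index)
for col in cols:
    mask &= (old_df[col] > 0) & (old_df[col] < 20)

new_df = old_df[mask]
print(new_df)
```

Adding another column to the check then only means extending cols.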
Solution 2:
Shorter version:
In [65]: df[(df>=0)&(df<=20)].dropna()
Out[65]:
   Name  C1  C2  C3
1  BBBB  12   1  10
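One caveat: the frame also holds the string column Name, and on Python 3 with recent pandas versions comparing strings with an integer is likely to raise a TypeError. A hedged variant that restricts the comparison to the numeric columns could look like this (num and in_range are illustrative names):

```python
import pandas as pd

# Sample data from the question.
data = {'Name': ['AAAA', 'BBBB'], 'C1': [25, 12], 'C2': [2, 1], 'C3': [1, 10]}
df = pd.DataFrame(data)

# Compare only the numeric columns, so 'Name' is never compared with an int.
num = df.select_dtypes(include='number')

# Keep rows where every numeric value lies in the closed range [0, 20].
in_range = ((num >= 0) & (num <= 20)).all(axis=1)
result = df[in_range]
print(result)
```

Using .all(axis=1) selects whole rows directly, so no dropna() step is needed and the integer columns are not converted to float along the way.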
Solution 3:
I like to use df.query() for this kind of thing:
df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')
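With more columns, the query string can be generated instead of typed out. A minimal sketch, assuming the same inclusive 0 to 20 bounds for every column; cols and expr are illustrative names:

```python
import pandas as pd

# Sample data from the question.
data = {'Name': ['AAAA', 'BBBB'], 'C1': [25, 12], 'C2': [2, 1], 'C3': [1, 10]}
df = pd.DataFrame(data)

cols = ['C1', 'C2', 'C3']  # columns to check (illustrative name)

# Build one "col >= 0 and col <= 20" clause per column and join them with "and".
expr = ' and '.join(f'{c} >= 0 and {c} <= 20' for c in cols)
print(expr)

result = df.query(expr)
print(result)
```

This only works as written for column names that are valid Python identifiers.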
Solution 4:
df.query("0 < C1 < 20 and 0 < C2 < 20 and 0 < C3 < 20")
or
df.query("0 < @df < 20").dropna()
Using @foo in df.query refers to the variable foo in the environment.
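As a small illustration of the @ syntax with plain scalars, the bounds can live in ordinary Python variables. The names lo and hi are hypothetical and not part of the original answer:

```python
import pandas as pd

# Sample data from the question.
data = {'Name': ['AAAA', 'BBBB'], 'C1': [25, 12], 'C2': [2, 1], 'C3': [1, 10]}
df = pd.DataFrame(data)

lo, hi = 0, 20  # hypothetical bounds, defined outside the query string

# @lo and @hi are looked up in the surrounding Python scope, not among the columns.
result = df.query('@lo < C1 < @hi and @lo < C2 < @hi and @lo < C3 < @hi')
print(result)
```

Changing the bounds then means editing lo and hi rather than the query string.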