Change With Nan If Values Stuck At A Single Value Over Time Using Python
As you can see below, my dataframe contains some identical consecutive values, i.e. 1, 2, and 3. Date Value 0 2017-07-18 07:40:00 1 1 2017-07-18 07:45:00 1 2 2017-07-18 07:50:0
Solution 1:
You could GroupBy consecutive values using a custom grouping scheme, check which groups have a size greater than or equal to 3, and use the result to index the dataframe and set the rows of interest to NaN:
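import numpy as np
# Label each run of identical consecutive values with its own group id
g = df.Value.diff().fillna(0).ne(0).cumsum()
# True for rows that belong to a run of 3 or more rows
m = df.groupby(g).Value.transform('size').ge(3)
df.loc[m,'Value'] = np.nan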
Date Value
0 2017-07-18-07:40:00 NaN
1 2017-07-18-07:45:00 NaN
2 2017-07-18-07:50:00 NaN
3 2017-07-18-07:55:00 2414.0
4 2017-07-18-08:00:00 2.0
5 2017-07-18-08:05:00 2.0
6 2017-07-18-08:10:00 4416.0
7 2017-07-18-08:15:00 4416.0
8 2017-07-18-08:20:00 NaN
9 2017-07-18-08:25:00 NaN
10 2017-07-18-08:30:00 NaN
11 2017-07-18-08:35:00 6998.0
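Where the intermediate columns look as follows (here df is assumed to still hold the original values, and df_ to be a copy of it that received the NaN assignment):
df.assign(grouper=g, mask=m, result=df_.Value)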
Date Value grouper mask result
0 2017-07-18-07:40:00 1 0 True NaN
1 2017-07-18-07:45:00 1 0 True NaN
2 2017-07-18-07:50:00 1 0 True NaN
3 2017-07-18-07:55:00 2414 1 False 2414.0
4 2017-07-18-08:00:00 2 2 False 2.0
5 2017-07-18-08:05:00 2 2 False 2.0
6 2017-07-18-08:10:00 4416 3 False 4416.0
7 2017-07-18-08:15:00 4416 3 False 4416.0
8 2017-07-18-08:20:00 3 4 True NaN
9 2017-07-18-08:25:00 3 4 True NaN
10 2017-07-18-08:30:00 3 4 True NaN
11 2017-07-18-08:35:00 6998 5 False 6998.0
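For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample data is rebuilt from the tables above. The helper name `mask_stuck_runs` and its `min_run` parameter are made up for illustration and are not part of pandas. It also detects run boundaries with `shift` rather than `diff`, which should behave the same on numeric data like this, and it returns a new column instead of modifying `df` in place.

```python
import pandas as pd

# Sample data reconstructed from the tables above.
df = pd.DataFrame({
    "Date": pd.date_range("2017-07-18 07:40:00", periods=12, freq="5min"),
    "Value": [1, 1, 1, 2414, 2, 2, 4416, 4416, 3, 3, 3, 6998],
})


def mask_stuck_runs(values: pd.Series, min_run: int = 3) -> pd.Series:
    """Hypothetical helper: NaN out runs of >= min_run identical consecutive values."""
    # A new run starts wherever a value differs from the previous row.
    run_id = values.ne(values.shift()).cumsum()
    # Length of the run each row belongs to, aligned with the original index.
    run_length = values.groupby(run_id).transform("size")
    # mask() replaces the flagged rows with NaN and leaves the rest untouched.
    return values.mask(run_length.ge(min_run))


result = df.assign(Value=mask_stuck_runs(df["Value"]))
print(result)
```

Because `df.assign` returns a new dataframe, the original `df` keeps its integer values, which makes it easy to compare before and after.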
Solution 2:
Count the values. The result is a series, which needs a name so that it can be used in the merge below:
counts = df['Value'].value_counts()
counts.name = '_'
Merge the selected values from the series with the original dataframe:
keep = counts[counts < 3]
df.merge(keep, left_on='Value', right_index=True)[df.columns]
# Date Value
#3 2017-07-18 07:55:00 2414
#4 2017-07-18 08:00:00 2
#5 2017-07-18 08:05:00 2
#6 2017-07-18 08:10:00 4416
#7 2017-07-18 08:15:00 4416
#11 2017-07-18 08:35:00 6998
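The result is a filtered dataframe: the rows with repeated values are dropped rather than set to NaN. Note that value_counts counts every occurrence of a value in the column, not only consecutive ones.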
If you use pandas version <0.24, you should upgrade, but here is a workaround:
df.merge(pd.DataFrame(keep), left_on='Value', right_index=True)[df.columns]
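If the goal is to keep all rows and write NaN instead of filtering, the same counting idea can be adapted. The following is only a sketch of that variation, not the answerer's code: it maps each row to the number of times its value occurs and uses `where` to blank out the frequent ones. The names `occurrences` and `masked` are chosen here for illustration. As with the merge above, the counts are global, so a value that shows up in several short, separate runs would also be replaced.

```python
import pandas as pd

# Sample data reconstructed from the tables above.
df = pd.DataFrame({
    "Date": pd.date_range("2017-07-18 07:40:00", periods=12, freq="5min"),
    "Value": [1, 1, 1, 2414, 2, 2, 4416, 4416, 3, 3, 3, 6998],
})

# Number of times each distinct value occurs anywhere in the column.
counts = df["Value"].value_counts()
# For every row, look up how often that row's value occurs.
occurrences = df["Value"].map(counts)
# Keep values seen fewer than 3 times; replace the others with NaN.
masked = df.assign(Value=df["Value"].where(occurrences.lt(3)))
print(masked)
```

On this particular sample the result matches Solution 1, because every value that occurs three or more times does so in a single consecutive run.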