
Using Pandas To Select Specific Seasons From A Dataframe Whose Values Are Over A Defined Threshold

Apologies for the wall of code, but I can't shorten it further... I want to sample climate data based on extreme seasons (seasons with temperatures greater or less than two standar

Solution 1:

Here's an example for selecting a normal spring followed by a warm summer (just using 1 std dev, not 2, for this example).

>>> seasdif[ (abs(seasdif) < seasdif.std()) &                     # within 1 std dev
             (seasdif.index.get_level_values('Season') == '1') &  # spring 
             (seasdif.shift(-1) > seasdif.std()) ]                # following summer

Year  Season
2036  1         0.064691
2038  1        -0.016453
2047  1         0.020691
2053  1         0.063338
2055  1        -0.045606
Name: A, dtype: float64

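My random data differs from yours, so here are my values for 2036, followed by the standard deviation, so that you can verify what the code is doing. In 2036, spring (season 1) is within one standard deviation of zero, and the following summer (season 2) is more than one standard deviation above it.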

>>> seasdif.loc[2036]

Season
1    0.064691
2    0.165824
3   -0.043372
4    0.086788
Name: A, dtype: float64

>>> seasdif.std()

0.09357005962032763
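For anyone who wants to run the selection end to end, here is a minimal, self-contained sketch. It assumes `seasdif` is a Series named 'A' with a (Year, Season) MultiIndex; the random data, the year range, and the integer season labels are made up for illustration (the snippet above compares against the string '1', so adjust to match your own index).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the seasonal anomaly Series from the question
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [range(2000, 2100), [1, 2, 3, 4]], names=["Year", "Season"]
)
seasdif = pd.Series(rng.normal(0, 0.1, len(index)), index=index, name="A")

std = seasdif.std()

# Boolean masks, one per condition
is_spring = seasdif.index.get_level_values("Season") == 1  # season 1 only
normal_now = seasdif.abs() < std                           # within 1 std dev
warm_next = seasdif.shift(-1) > std                        # next row (summer) is warm

normal_spring_warm_summer = seasdif[normal_now & is_spring & warm_next]
print(normal_spring_warm_summer)
```

Naming each mask makes it easier to swap in a different threshold (for example `2 * std`) or a different season without touching the rest of the expression.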

Solution 2:

The following code creates a dataframe that has your year, season, temperature, two flag columns for unusually hot and cold weather this season, and two flag columns for unusually hot and cold weather last season.

First, duplicate your dataframe, and add flags for unusual weather to the new dataframe:

import pandas as pd

# Copy the temperature Series into a DataFrame with a single column 'A'
seasdif2 = pd.DataFrame(seasdif)

# Compute the 2-standard-deviation threshold once
threshold = seasdif.std() * 2

# 1 where the season is beyond the threshold, 0 otherwise
seasdif2['cold'] = (seasdif < -threshold).astype(int)
seasdif2['warm'] = (seasdif > threshold).astype(int)

Then, drop your temperature column 'A', so that you have a "flags only" dataframe:

seasdif2 = seasdif2.drop(columns='A')

Now, concatenate your flags to your original temperature dataframe. By shifting the index of the flags as you concatenate, you can flag whether the unusual weather happened last season as opposed to this season.

In this case, seasdif2 adds flag columns for unusually warm and cold weather this season, seasdif2.shift(1) adds columns for the previous season, and seasdif2.shift(-1) adds columns for the following season:

flagged_seasons = pd.concat([seasdif, seasdif2, seasdif2.shift(-1), seasdif2.shift(1)], axis=1)

Be careful when doing this, however, as you'll end up with multiple "warm" and "cold" flag columns. Make sure you rename the columns added by shift(-1) to something like "cold_next" and "warm_next", and the columns added by shift(1) to "cold_previous" and "warm_previous".

Now you can select rows where unusual weather occurred in two consecutive seasons. If you wanted to find whether a hot season is followed by a cold season, you would just select dataframe rows where warm==1 and cold_next==1, for example.
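The whole approach can be sketched in one runnable piece. This is only an illustration under assumptions: the data is random, the index layout (Year, Season) is guessed from the question, and the `_previous` / `_next` suffixes are names chosen here, applied with `add_suffix` so the duplicate column names never appear. Note that `shift(1)` moves each flag down one row, so it describes the previous season, while `shift(-1)` describes the following one.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the seasonal anomaly Series from the question
rng = np.random.default_rng(1)
index = pd.MultiIndex.from_product(
    [range(2000, 2100), [1, 2, 3, 4]], names=["Year", "Season"]
)
seasdif = pd.Series(rng.normal(0, 0.1, len(index)), index=index, name="A")

# Flags for seasons more than 2 standard deviations from zero
threshold = 2 * seasdif.std()
flags = pd.DataFrame({
    "warm": (seasdif > threshold).astype(int),
    "cold": (seasdif < -threshold).astype(int),
})

# Attach this season's flags plus the neighbouring seasons' flags.
# fill_value=0 keeps the columns as integers at the two ends of the record.
flagged_seasons = pd.concat(
    [
        seasdif,
        flags,
        flags.shift(1, fill_value=0).add_suffix("_previous"),
        flags.shift(-1, fill_value=0).add_suffix("_next"),
    ],
    axis=1,
)

# Hot season immediately followed by a cold season
hot_then_cold = flagged_seasons[
    (flagged_seasons["warm"] == 1) & (flagged_seasons["cold_next"] == 1)
]
print(hot_then_cold)
```

With a 2-standard-deviation threshold, back-to-back extremes are rare, so the result may well be empty for random data; lowering the threshold is an easy way to check that the selection behaves as expected.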

