
Drop Rows After Particular Year Pandas

I have a column in my dataframe that has years in the following format: 2018-19 2017-18 The years are object data type. I want to change the type of this column to datetime, then

Solution 1:

I think the simplest approach here is to compare the years separately, e.g. using only the part before the -:

print (BOS)
    Season
0  1979-80
1  2018-19
2  2017-18


df = BOS[BOS['Season'].str.split('-').str[0].astype(int) < 2017]
print (df)
    Season
0  1979-80
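
The snippet above assumes BOS already exists. A minimal self-contained sketch, assuming a frame rebuilt from the three seasons printed above (the intermediate name start_year is mine, not from the answer):

```python
import pandas as pd

# Sample frame rebuilt from the three seasons shown in the printed output.
BOS = pd.DataFrame({'Season': ['1979-80', '2018-19', '2017-18']})

# Starting year of each season as an integer, e.g. '2018-19' -> 2018.
start_year = BOS['Season'].str.split('-').str[0].astype(int)

# Boolean mask: keep only the seasons that start before 2017.
df = BOS[start_year < 2017]
print(df)
```

Storing the year in its own variable keeps the mask readable and lets you reuse it for other cutoffs.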

Details:

First the values are split into lists by Series.str.split, and then the first element of each list is selected with .str[0]:

print (BOS['Season'].str.split('-'))
0    [1979, 80]
1    [2018, 19]
2    [2017, 18]
Name: Season, dtype: object

print (BOS['Season'].str.split('-').str[0])
0    1979
1    2018
2    2017
Name: Season, dtype: object
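
Note that the selected values are still strings (dtype: object), so they have to be converted to numbers before comparing them with a year. .astype(int) raises an error if any value is not a valid integer. If the column might contain such values, one possible variation is pd.to_numeric with errors='coerce'. This is only a sketch, and the malformed entry 'unknown' is made up for illustration:

```python
import pandas as pd

# Hypothetical data: the last entry is deliberately malformed.
seasons = pd.Series(['1979-80', '2018-19', 'unknown'], name='Season')

# Text before the first '-', still stored as strings.
first_part = seasons.str.split('-').str[0]

# errors='coerce' turns values that cannot be parsed into NaN.
start_year = pd.to_numeric(first_part, errors='coerce')

# Comparisons with NaN are False, so malformed rows are dropped by the mask.
kept = seasons[start_year < 2017]
print(kept.tolist())
```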

Or convert both years to separate columns:

BOS['start'] = pd.to_datetime(BOS['Season'].str.split('-').str[0], format='%Y').dt.year
BOS['end'] = BOS['start'] + 1
print (BOS)
    Season  start   end
0  1979-80   1979  1980
1  2018-19   2018  2019
2  2017-18   2017  2018
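
Since end is computed as start + 1, it can be worth checking that it agrees with the two-digit suffix in the original string. A rough sketch of such a check, assuming the same three seasons as above. The names parts, suffix and consistent are mine, and it uses astype(int) rather than pd.to_datetime because only the integer year is needed:

```python
import pandas as pd

BOS = pd.DataFrame({'Season': ['1979-80', '2018-19', '2017-18']})

# expand=True returns a DataFrame with one column per piece (columns 0 and 1).
parts = BOS['Season'].str.split('-', expand=True)

BOS['start'] = parts[0].astype(int)
BOS['end'] = BOS['start'] + 1

# The last two digits of the computed end year should match the suffix,
# e.g. 1980 % 100 == 80.
suffix = parts[1].astype(int)
consistent = (BOS['end'] % 100) == suffix

print(BOS)
print(consistent.all())
```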

Solution 2:

I would use the .str.slice accessor of the Series to select the part of the string I wish to keep and pass it to pd.to_datetime(). Selecting the rows with .loc[] and a boolean mask then becomes easy.

import pandas as pd 

data = {
    'date' : ['2016-17', '2017-18', '2018-19', '2019-20']
}
df = pd.DataFrame(data)
print(df)
#       date
# 0  2016-17
# 1  2017-18
# 2  2018-19
# 3  2019-20

df['date'] = pd.to_datetime(df['date'].str.slice(0, 4), format='%Y')
print(df)
#         date
# 0 2016-01-01
# 1 2017-01-01
# 2 2018-01-01
# 3 2019-01-01

df = df.loc[ df['date'].dt.year < 2018 ]
print(df)
#         date
# 0 2016-01-01
# 1 2017-01-01
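
The code above overwrites the original season strings with timestamps. If you would rather keep the labels as they are, the same idea can be wrapped in a small helper. This is only a sketch; the function name keep_seasons_before and its parameters are hypothetical, not part of pandas:

```python
import pandas as pd

def keep_seasons_before(df, column, year):
    """Return the rows whose season in `column` starts before `year`.

    Hypothetical helper: the season strings themselves are left untouched.
    """
    # First four characters hold the starting year, e.g. '2017-18' -> '2017'.
    start = pd.to_datetime(df[column].str.slice(0, 4), format='%Y')
    return df.loc[start.dt.year < year]

data = {
    'date' : ['2016-17', '2017-18', '2018-19', '2019-20']
}
df = pd.DataFrame(data)

result = keep_seasons_before(df, 'date', 2018)
print(result)
```

Here result holds the rows for 2016-17 and 2017-18, and df itself is not modified.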
