Drop Rows After Particular Year Pandas
I have a column in my dataframe that has years in the following format: 2018-19 2017-18 The years are object data type. I want to change the type of this column to datetime, then
Solution 1:
I think the simplest approach here is to compare the years separately, e.g. the part before the '-':
print (BOS)
    Season
0  1979-80
1  2018-19
2  2017-18
df = BOS[BOS['Season'].str.split('-').str[0].astype(int) < 2017]
print (df)
    Season
0  1979-80
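The snippet above assumes that BOS already exists. A minimal self-contained sketch, assuming BOS is a DataFrame with a single string column Season holding the three sample values shown (the name start_year is introduced only for illustration):

```python
import pandas as pd

# Sample data rebuilt from the printed output above
# (an assumption about how BOS was created)
BOS = pd.DataFrame({'Season': ['1979-80', '2018-19', '2017-18']})

# Take the part before '-' and convert it to an integer start year
start_year = BOS['Season'].str.split('-').str[0].astype(int)

# Keep only the seasons that start before 2017
df = BOS[start_year < 2017]
print(df)
```

Changing the comparison (for example to <= 2017) moves the cut-off without touching the rest of the code.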
Details:
First, the values are split into lists by Series.str.split,
and then the first element of each list is selected:
print (BOS['Season'].str.split('-'))
0 [1979, 80]
1 [2018, 19]
2 [2017, 18]
Name: Season, dtype: object
print (BOS['Season'].str.split('-').str[0])
0    1979
1    2018
2    2017
Name: Season, dtype: object
Or convert both years to separate columns:
BOS['start'] = pd.to_datetime(BOS['Season'].str.split('-').str[0], format='%Y').dt.year
BOS['end'] = BOS['start'] + 1
print (BOS)
    Season  start   end
0  1979-80   1979  1980
1  2018-19   2018  2019
2  2017-18   2017  2018
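Once the start and end columns exist, rows can be dropped by either boundary. A small sketch, assuming the same three sample seasons and using astype(int) in place of pd.to_datetime for the conversion (both give the same year for this data); the name kept is illustrative:

```python
import pandas as pd

# Same sample seasons as above (assumed input)
BOS = pd.DataFrame({'Season': ['1979-80', '2018-19', '2017-18']})

# Integer start year from the part before '-', end year is one later
BOS['start'] = BOS['Season'].str.split('-').str[0].astype(int)
BOS['end'] = BOS['start'] + 1

# Drop every season that ends after 2018
kept = BOS[BOS['end'] <= 2018]
print(kept)
```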
Solution 2:
I would use the .str.slice accessor of the Series to select the part of the date I wish to keep and pass it to the pd.to_datetime() function. Selecting rows with .loc[] and a boolean mask then becomes easy.
import pandas as pd
data = {
'date' : ['2016-17', '2017-18', '2018-19', '2019-20']
}
df = pd.DataFrame(data)
print(df)
#       date
# 0  2016-17
# 1  2017-18
# 2  2018-19
# 3  2019-20
df['date'] = pd.to_datetime(df['date'].str.slice(0, 4), format='%Y')
print(df)
#         date
# 0 2016-01-01
# 1 2017-01-01
# 2 2018-01-01
# 3 2019-01-01
df = df.loc[ df['date'].dt.year < 2018 ]
print(df)
#         date
# 0 2016-01-01
# 1 2017-01-01
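Note that this approach overwrites the original season strings with timestamps. If the original labels should be kept, one possible variation (a sketch, not part of the answer; start, mask, and result are illustrative names) builds the mask from a temporary Series instead:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2016-17', '2017-18', '2018-19', '2019-20']})

# Parse the first four characters as a year without touching the original column
start = pd.to_datetime(df['date'].str.slice(0, 4), format='%Y')

# Boolean mask: seasons that start before 2018
mask = start.dt.year < 2018

# The filtered frame still holds the original 'YYYY-YY' strings
result = df.loc[mask]
print(result)
```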