Configuring the Date Parser When Using pandas.PeriodIndex
I have a list of dates in this format:
>>> dates = ['01/01/2000', '02/01/2000', '25/01/2000', '01/01/3005']
I'd like to create a pandas.PeriodIndex from these dates. Note
Solution 1:
Use to_datetime with a format string to create a DatetimeIndex. This has a to_period method that converts it to a PeriodIndex for you:
In [63]:
dates = ["01/01/2000", "02/01/2000", "25/01/2000"]
pd.to_datetime(dates, format='%d/%m/%Y').to_period(freq='D')
Out[63]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25'], dtype='period[D]', freq='D')
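To see why the format string matters here, this small sketch parses one ambiguous string both ways. The variable names are only for illustration, and it assumes pandas' default of reading such strings month-first when no hint is given:

```python
import pandas as pd

ambiguous = "02/01/2000"

# With no hints, pandas reads this string month-first (1 February).
default = pd.to_datetime(ambiguous)

# An explicit day-first format reads the same string as 2 January.
explicit = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(default.date(), explicit.date())
```

Without the format, '02/01/2000' would quietly end up in the wrong month, which is easy to miss in a list where most values look plausible either way.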
You can also just pass dayfirst=True:
In [64]:
dates = ["01/01/2000", "02/01/2000", "25/01/2000"]
pd.to_datetime(dates, dayfirst=True).to_period(freq='D')
Out[64]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25'], dtype='period[D]', freq='D')
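One caveat: the pandas docs describe dayfirst as a preference rather than a strict rule. If you need to check that every string really is day-first, an explicit format is the safer choice. A rough sketch, where the mixed list is made-up input and errors='coerce' is used so mismatches show up as NaT instead of raising:

```python
import pandas as pd

# "01/25/2000" is month-first, so it cannot match a day-first format.
mixed = ["25/01/2000", "01/25/2000"]

# errors="coerce" turns non-matching entries into NaT instead of raising.
checked = pd.to_datetime(mixed, format="%d/%m/%Y", errors="coerce")

# Collect the original strings that failed to parse.
bad = [s for s, ok in zip(mixed, checked.notna()) if not ok]
print(bad)
```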
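Update: for dates that to_datetime cannot handle, such as 01/01/3005 (which lies beyond the nanosecond-resolution Timestamp limit in the year 2262), you can split the date strings, convert the parts to int, and pass these as arguments to the PeriodIndex constructor: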
In [67]:
dates = ['01/01/2000', '02/01/2000', '25/01/2000', '01/01/3005']
df = pd.DataFrame({'dates': dates})
df
Out[67]:
        dates
0  01/01/2000
1  02/01/2000
2  25/01/2000
3  01/01/3005
In [72]:
df[['day','month','year']] = df['dates'].str.split('/', expand=True).astype(int)
df
Out[72]:
        dates  day  month  year
0  01/01/2000    1      1  2000
1  02/01/2000    2      1  2000
2  25/01/2000   25      1  2000
3  01/01/3005    1      1  3005
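Note that splitting on '/' does not check that the pieces form a real calendar date. If that matters, one possible variation (not part of the approach above, just a sketch) is to let the standard library's datetime.strptime do the parsing, since Python's datetime copes with the year 3005, and then build each Period from its integer fields:

```python
from datetime import datetime

import pandas as pd

dates = ["01/01/2000", "02/01/2000", "25/01/2000", "01/01/3005"]

# strptime raises ValueError for impossible dates such as "31/02/2000".
parsed = [datetime.strptime(s, "%d/%m/%Y") for s in dates]

# Build each Period from integer fields, so no Timestamp is involved.
periods = pd.PeriodIndex(
    [pd.Period(year=d.year, month=d.month, day=d.day, freq="D") for d in parsed]
)
print(periods)
```

This loops in Python, so it will be slower than the vectorised split on long lists.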
In [75]:
df['period'] = pd.PeriodIndex(day = df['day'], month=df['month'], year = df['year'], freq='D')
df
Out[75]:
        dates  day  month  year      period
0  01/01/2000    1      1  2000  2000-01-01
1  02/01/2000    2      1  2000  2000-01-02
2  25/01/2000   25      1  2000  2000-01-25
3  01/01/3005    1      1  3005  3005-01-01
You can see that this produces the desired result:
In [77]:
pd.PeriodIndex(day= df['day'], month=df['month'], year= df['year'], freq='D')
Out[77]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25', '3005-01-01'], dtype='period[D]', freq='D')
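As far as I know, newer pandas releases (2.2 onwards) deprecate passing year, month and day straight to the PeriodIndex constructor and point to PeriodIndex.from_fields instead. A version-tolerant sketch, assuming from_fields is available in your install and falling back to the constructor otherwise (the parts and fields names are only for illustration):

```python
import pandas as pd

dates = ["01/01/2000", "02/01/2000", "25/01/2000", "01/01/3005"]

# Split "dd/mm/yyyy" into three integer columns.
parts = pd.Series(dates).str.split("/", expand=True).astype(int)
parts.columns = ["day", "month", "year"]

fields = dict(
    year=parts["year"].to_numpy(),
    month=parts["month"].to_numpy(),
    day=parts["day"].to_numpy(),
    freq="D",
)

# Prefer from_fields where it exists; older versions use the constructor.
if hasattr(pd.PeriodIndex, "from_fields"):
    periods = pd.PeriodIndex.from_fields(**fields)
else:
    periods = pd.PeriodIndex(**fields)

print(periods)
```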