
Configuring the Date Parser When Using pandas.PeriodIndex

I have a list of dates in this format:

>>> dates = ['01/01/2000', '02/01/2000', '25/01/2000', '01/01/3005']

I'd like to create a pandas.PeriodIndex from these dates. Note

Solution 1:

Use to_datetime with a format string to create a DatetimeIndex, then call its to_period method to convert it to a PeriodIndex:

In [63]:
import pandas as pd
dates = ["01/01/2000", "02/01/2000", "25/01/2000"]
pd.to_datetime(dates, format='%d/%m/%Y').to_period(freq='D')

Out[63]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25'], dtype='period[D]', freq='D')

You can also just pass dayfirst=True:

In [64]:
dates = ["01/01/2000", "02/01/2000", "25/01/2000"]
pd.to_datetime(dates, dayfirst=True).to_period(freq='D')

Out[64]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25'], dtype='period[D]', freq='D')
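
Both calls assume that every date fits into a Timestamp. As a hedged sketch, passing errors="coerce" is one way to check this before converting: entries that cannot be represented come back as NaT instead of raising. On pandas versions that parse to nanosecond resolution (upper limit around the year 2262), the question's fourth date, 01/01/3005, is expected to become NaT; newer versions may parse it at a coarser resolution instead, so the code reports whichever happens rather than assuming one outcome.

```python
import pandas as pd

# The question's full list, including the far-future date.
dates = ["01/01/2000", "02/01/2000", "25/01/2000", "01/01/3005"]

# errors="coerce" returns NaT for entries that cannot be converted
# instead of raising an exception.
parsed = pd.to_datetime(dates, format="%d/%m/%Y", errors="coerce")

# Collect the original strings that did not survive the conversion.
failed = [d for d, ts in zip(dates, parsed) if pd.isna(ts)]
print("could not be parsed as Timestamp:", failed)

# The remaining entries convert to daily periods as before.
periods = parsed[parsed.notna()].to_period(freq="D")
print(periods)
```

If the failed list is not empty, the dates need to be converted without going through Timestamp at all.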

Update

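For dates that fall outside the range a nanosecond-resolution Timestamp can represent (roughly 1677 to 2262), such as 01/01/3005, you can split the date strings, convert the parts to int, and pass them as arguments to the PeriodIndex constructor: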

In [67]:
dates = ["01/01/2000", "02/01/2000", "25/01/2000", "01/01/3005"]
df = pd.DataFrame({'dates': dates})
df

Out[67]:
        dates
0  01/01/2000
1  02/01/2000
2  25/01/2000
3  01/01/3005

In [72]:
df[['day','month','year']] = df['dates'].str.split('/', expand=True).astype(int)
df

Out[72]:
        dates  day  month  year
0  01/01/2000    1      1  2000
1  02/01/2000    2      1  2000
2  25/01/2000   25      1  2000
3  01/01/3005    1      1  3005


In [75]:
df['period'] = pd.PeriodIndex(day=df['day'], month=df['month'], year=df['year'], freq='D')
df

Out[75]:
        dates  day  month  year     period
0  01/01/2000    1      1  2000 2000-01-01
1  02/01/2000    2      1  2000 2000-01-02
2  25/01/2000   25      1  2000 2000-01-25
3  01/01/3005    1      1  3005 3005-01-01

You can see that this produces the desired result:

In [77]:
pd.PeriodIndex(day=df['day'], month=df['month'], year=df['year'], freq='D')

Out[77]:
PeriodIndex(['2000-01-01', '2000-01-02', '2000-01-25', '3005-01-01'], dtype='period[D]', freq='D')
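
The field-based PeriodIndex constructor used above is, as far as I know, deprecated in recent pandas releases (2.2 onward) in favor of the PeriodIndex.from_fields class method. The following is a minimal sketch, not the answer author's code, that uses from_fields when it exists and otherwise falls back to building individual pd.Period objects. The helper name to_period_index is made up for illustration, and it assumes every string is a well-formed dd/mm/yyyy date.

```python
import pandas as pd


def to_period_index(dates, freq="D"):
    """Convert 'dd/mm/yyyy' strings to a PeriodIndex without going through Timestamp."""
    # Split each string into integer day, month, and year fields.
    days, months, years = zip(*(map(int, d.split("/")) for d in dates))

    if hasattr(pd.PeriodIndex, "from_fields"):
        # Class method available in newer pandas releases.
        return pd.PeriodIndex.from_fields(
            year=list(years), month=list(months), day=list(days), freq=freq
        )

    # Fallback for older releases: build one Period per date.
    return pd.PeriodIndex(
        [pd.Period(year=y, month=m, day=d, freq=freq)
         for y, m, d in zip(years, months, days)]
    )


dates = ["01/01/2000", "02/01/2000", "25/01/2000", "01/01/3005"]
print(to_period_index(dates))
```

Working from integer fields skips the intermediate DataFrame columns and never creates a Timestamp, so the year 3005 is handled the same way as the others.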
