Converting Text To Datetime64 In Numpy
I have a numpy array of strings (P.S. why is a string represented as object?!) t = array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM', '07/04/2014 05:50:00 PM', '17/0
Solution 1:
The date format is the problem: a date like 01/01/2015 is ambiguous (it could be day-first or month-first). If it were in ISO 8601 you could parse it directly using numpy. In your case, since you only want the date, splitting and rearranging the data will be significantly faster:
from datetime import datetime
import numpy as np
t = np.array([datetime.strptime(d.split(None)[0], "%d/%m/%Y")
              for d in t], dtype='datetime64[us]').astype('datetime64[D]')
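To make the ISO 8601 point concrete, here is a minimal sketch of the rearranging approach. It assumes every string starts with a zero-padded dd/mm/yyyy date, and the helper name to_iso_dates is made up for illustration:

```python
import numpy as np

def to_iso_dates(strings):
    """Rearrange 'dd/mm/yyyy ...' strings to 'yyyy-mm-dd' and let numpy parse them."""
    iso = []
    for s in strings:
        # Split on the first two slashes: day, month, then 'yyyy hh:mm:ss AM'.
        day, month, rest = s.split("/", 2)
        # The first four characters of the remainder are the year.
        iso.append("{}-{}-{}".format(rest[:4], month, day))
    # numpy parses ISO 8601 date strings directly.
    return np.array(iso, dtype="datetime64[D]")

t = np.array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM'])
dates = to_iso_dates(t)
print(dates)
```

Because the day and month positions are fixed by the split, nothing has to guess whether the string is day-first or month-first.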
Some timings. First, splitting each string and rearranging it into ISO 8601 order:
In [36]: %%timeit
from datetime import datetime
t = np.array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM',
'07/04/2014 05:50:00 PM', '17/02/2014 10:20:00 PM',
'07/03/2014 06:10:00 AM', '02/03/2014 12:25:00 PM',
'05/02/2014 03:20:00 AM', '31/01/2014 12:30:00 AM',
'28/02/2014 01:25:00 PM']*10000)
t1 = np.array([np.datetime64("{}-{}-{}".format(c[:4], b, a)) for a, b, c in (s.split("/", 2) for s in t)])
....:
10 loops, best of 3: 125 ms per loop
Your code:
In [37]: %%timeit
from datetime import datetime
t = np.array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM',
'07/04/2014 05:50:00 PM', '17/02/2014 10:20:00 PM',
'07/03/2014 06:10:00 AM', '02/03/2014 12:25:00 PM',
'05/02/2014 03:20:00 AM', '31/01/2014 12:30:00 AM',
'28/02/2014 01:25:00 PM']*10000)
t = [datetime.strptime(tt,"%d/%m/%Y %H:%M:%S %p") for tt in t]
t = np.array(t,dtype='datetime64[us]').astype('datetime64[D]')
....:
1 loops, best of 3: 1.56 s per loop
A dramatic difference (125 ms versus 1.56 s, roughly 12 times faster), with both giving the same result:
In [48]: t = np.array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM',
'07/04/2014 05:50:00 PM', '17/02/2014 10:20:00 PM',
'07/03/2014 06:10:00 AM', '02/03/2014 12:25:00 PM',
'05/02/2014 03:20:00 AM', '31/01/2014 12:30:00 AM',
'28/02/2014 01:25:00 PM'] * 10000)
In [49]: t1 = [datetime.strptime(tt,"%d/%m/%Y %H:%M:%S %p") for tt in t]
t1 = np.array(t1,dtype='datetime64[us]').astype('datetime64[D]')
....:
In [50]: t2 = np.array([np.datetime64("{}-{}-{}".format(c[:4], b, a)) for a, b, c in (s.split("/", 2) for s in t)])
In [51]: (t1 == t2).all()
Out[51]: True
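One caveat if you ever need the time of day as well as the date: in the strptime format above, %p has no effect on the hour because it is combined with %H (24-hour clock). That does not matter here, since only the date is kept, but a sketch that keeps the time would need %I instead. The variable names below are only for illustration:

```python
from datetime import datetime
import numpy as np

t = np.array(['21/02/2014 08:40:00 AM', '11/02/2014 10:50:00 PM',
              '31/01/2014 12:30:00 AM'])

# %I parses the hour on a 12-hour clock, so the AM/PM marker (%p) is applied.
parsed = [datetime.strptime(s, "%d/%m/%Y %I:%M:%S %p") for s in t]

# Keep second resolution instead of truncating to whole days.
stamps = np.array(parsed, dtype='datetime64[s]')
print(stamps)
```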