I can't seem to convert a series containing date strings into a dtype of datetime64. The following code reproduces the error:
import pandas as pd
gud_date_s = pd.Series(["2019/12/31 00:00:00.0"]*100)
gud_date_s2 = pd.Series(["2261/12/31 00:00:00.0"]*100)
bad_date_s = pd.Series(["9999/12/31 00:00:00.0"]*100)
bad_date_s2 = pd.Series(["2262/12/31 00:00:00.0"]*100)
gd1 = pd.to_datetime(gud_date_s, format="%Y/%m/%d", yearfirst=True).dt.date # Correct
gd2 = pd.to_datetime(gud_date_s2 , format="%Y/%m/%d", yearfirst=True).dt.date # Correct
bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True).dt.date
#Returns {ValueError}time data 9999/12/31 00:00:00.0 doesn't match format specified.
bd2 = pd.to_datetime(bad_date_s2 , format="%Y/%m/%d", yearfirst=True).dt.date
#Returns {ValueError}time data 2262/12/31 00:00:00.0 doesn't match format specified.
So the threshold of accepted years seems to be 2261. Why? How do I fix this?
N.B.: dates such as 9999/12/31 are relevant, so I would like to keep them as-is.
Cheers
The year 9999 is not a valid value here, because it lies beyond the largest timestamp pandas can represent, so errors='coerce' is necessary to convert it to NaT:
bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date
The second case raises an error because of the same limit: the year 2262 is valid, but the latest representable date in that year is 11 April. Unfortunately, the error message could be clearer here:
bd2 = pd.to_datetime(bad_date_s2 , format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date
print (pd.Timestamp.max)
2262-04-11 23:47:16.854775807
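To see where that limit comes from: datetime64[ns] stores each timestamp as a signed 64-bit count of nanoseconds since 1970-01-01. A rough sketch using only the standard library (the names INT64_MAX_NS, epoch and max_dt are mine, just for illustration) reproduces the same date, down to the microsecond resolution Python's datetime supports:

```python
from datetime import datetime, timedelta

# Largest value a signed 64-bit integer can hold, read as nanoseconds since the epoch.
INT64_MAX_NS = 2**63 - 1

# Python's datetime only has microsecond resolution, so drop the last three digits.
epoch = datetime(1970, 1, 1)
max_dt = epoch + timedelta(microseconds=INT64_MAX_NS // 1000)

print(max_dt)  # 2262-04-11 23:47:16.854775
```

Anything later than this, such as 2262/12/31 or 9999/12/31, cannot be stored with nanosecond precision.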
Filling the NaT values back in with a Python datetime does not work either; it raises an error:
from datetime import datetime
d = datetime(year=9999, month=12, day=31)
bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date.fillna(d)
print (bd1)
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-12-31 00:00:00
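If the goal is only to keep dates such as 9999/12/31 as-is, one possible workaround is to skip the pandas timestamp conversion and parse with the standard library, which supports years up to 9999. This is a minimal sketch, not something from the code above: parse_date is a hypothetical helper, and the format string assumes every value looks like the samples in the question.

```python
from datetime import date, datetime


def parse_date(text: str) -> date:
    """Parse 'YYYY/MM/DD HH:MM:SS.f' with the standard library and keep only the date."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M:%S.%f").date()


# Sample values taken from the question, including the two out-of-range ones.
raw = ["2019/12/31 00:00:00.0", "2262/12/31 00:00:00.0", "9999/12/31 00:00:00.0"]
parsed = [parse_date(s) for s in raw]

print(parsed[-1])  # 9999-12-31
```

Applied to a series, for example bad_date_s.map(parse_date), this should give an object-dtype series of datetime.date values, so the .dt accessor and the fast datetime64 operations are not available on the result.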