Given the following data frame, which came from importing a messy Excel spreadsheet:
import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'dates': ['2015-08-31 00:00:00', '2015-08-24 00:00:00', '8/3/2015, 1/4/16']})
try:
    df['dates'] = df['dates'].astype('datetime64[ns]')
except (ValueError, TypeError):
    pass  # the conversion fails because row 2 holds two dates
df
   A                dates
0  a  2015-08-31 00:00:00
1  b  2015-08-24 00:00:00
2  c     8/3/2015, 1/4/16
I want to split where more than one date exists and take only the first one like this:
   A                dates
0  a  2015-08-31 00:00:00
1  b  2015-08-24 00:00:00
2  c             8/3/2015
Ideally, the result would then be converted to the same format as the other rows, like this:
   A                dates
0  a  2015-08-31 00:00:00
1  b  2015-08-24 00:00:00
2  c  2015-08-03 00:00:00
Thanks in advance!
You can use pd.to_datetime() in conjunction with .str.split():
In [215]: pd.to_datetime(df.dates.str.split(r',\s*').str[0])
Out[215]:
0 2015-08-31
1 2015-08-24
2 2015-08-03
Name: dates, dtype: datetime64[ns]
or, to assign the result back to the column:
In [216]: df['dates'] = pd.to_datetime(df.dates.str.split(r',\s*').str[0])
In [217]: df
Out[217]:
   A      dates
0  a 2015-08-31
1  b 2015-08-24
2  c 2015-08-03
The resulting dtypes:
In [219]: df.dtypes
Out[219]:
A object
dates datetime64[ns]
dtype: object
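One caveat worth noting: as far as I know, pandas 2.0 and later infer a single format from the first value when a whole column is passed to pd.to_datetime(), so a column that mixes '2015-08-31 00:00:00' with '8/3/2015' may raise instead of parsing. A minimal sketch that sidesteps this by parsing each value on its own is shown below. It assumes the same df as in the question and that slash-separated dates are month-first (so '8/3/2015' means August 3); the variable name `first` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'dates': ['2015-08-31 00:00:00', '2015-08-24 00:00:00', '8/3/2015, 1/4/16'],
})

# Keep only the text before the first comma in each cell.
# The raw string avoids an invalid-escape warning for \s.
first = df['dates'].str.split(r',\s*').str[0]

# Parse every value separately so each row gets its own format inference.
# Assumption: ambiguous dates such as 8/3/2015 are month-first.
df['dates'] = first.apply(pd.to_datetime)

print(df)
print(df.dtypes)
```

Parsing row by row is slower than one vectorized call, so on a large column it may be worth trying pd.to_datetime(first, format='mixed') instead, which should be available from pandas 2.0 onward.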