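I receive strings like the ones below, where the date at the end comes in an unpredictable format. The date part will only ever contain underscores, slashes, digits, or hyphens.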
TRAVEL_DELAY_01072015
TRAVEL_DELAY_01_07_2015
TRAVEL_DELAY_2015/01/04
TRAVEL_DELAY_2015-01-04
I only need to extract TRAVEL_DELAY from the strings above. I am using a regular expression for this, but it does not work:
m = re.match("^(.*)[_0-9\/.]+", abovestring)
If you just want to split off the date:
s="""TRAVEL_DELAY_01072015
TRAVEL_DELAY_01_07_2015
TRAVEL_DELAY_2015/01/04
TRAVEL_DELAY_2015-01-04"""
for line in s.splitlines():
    # Split on the first two underscores only; the last piece is the date.
    date = line.split("_", 2)[-1]
    print(date)
01072015
01_07_2015
2015/01/04
2015-01-04
Or use str.replace, no regex needed:
for line in s.splitlines():
    date = line.replace("TRAVEL_DELAY_", "")
    print(date)
01072015
01_07_2015
2015/01/04
2015-01-04
If you actually want to parse the date, you can use dateutil after fixing up the string:
from dateutil import parser
for line in s.splitlines():
    date = line.replace("TRAVEL_DELAY_", "")
    if any(ch in date for ch in ("/", "-", "_")):
        print(parser.parse(date.replace("_", "-")))
    else:
        date = "{}-{}-{}".format(date[:2], date[2:4], date[4:])
        print(parser.parse(date))
2015-01-07 00:00:00
2015-01-07 00:00:00
2015-01-04 00:00:00
2015-01-04 00:00:00
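If you would rather not depend on dateutil guessing the field order, a rough standard-library sketch is to try a fixed list of formats with datetime.strptime. The names PREFIX, FORMATS and parse_suffix are made up for this illustration, and the month-first order for the first two formats is an assumption chosen to match the dateutil output above:

```python
from datetime import datetime

PREFIX = "TRAVEL_DELAY_"  # assumed constant prefix, as in the sample strings
# Assumed formats; month-first mirrors the dateutil output shown above.
FORMATS = ("%m%d%Y", "%m_%d_%Y", "%Y/%m/%d", "%Y-%m-%d")

def parse_suffix(line):
    """Return the trailing date of `line` as a datetime, trying each format in turn."""
    text = line[len(PREFIX):] if line.startswith(PREFIX) else line
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not fit, try the next one
    raise ValueError("unrecognised date: {!r}".format(text))

for line in ("TRAVEL_DELAY_01072015", "TRAVEL_DELAY_2015/01/04"):
    print(parse_suffix(line))
```

A string that fits none of the listed formats raises ValueError instead of being silently guessed at.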
If digits appear only in the date, and what you actually want is the string rather than the date:
s="""TRAVEL_DELAY_01072015
TRAVEL_DELAY_01_07_2015
TRAVEL_DELAY_2015/01/04
Travel_Delay_Data_2015/01/04
TRAVEL_DELAY_2015-01-04"""
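for line in s.splitlines():
    # Index of the first digit, i.e. where the date starts.
    ind = next(ind for ind, ele in enumerate(line) if ele.isdigit())
    # Drop the date and the single separator in front of it.
    prefix = line[:ind-1]
    print(prefix)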
TRAVEL_DELAY
TRAVEL_DELAY
TRAVEL_DELAY
Travel_Delay_Data
TRAVEL_DELAY
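If you do want to stay with a regex, the pattern in the question fails because the greedy (.*) first swallows the whole string and then backtracks by just one character so the character class can match, which leaves almost everything in the group (for example TRAVEL_DELAY_0107201). One possible fix, sketched below, is to anchor the date part at the end of the string and remove it. DATE_SUFFIX and strip_date are names invented for this example, and it assumes the prefix itself never ends in a digit, underscore, slash, or hyphen:

```python
import re

# A trailing run of digits, underscores, slashes or hyphens, anchored at the end.
DATE_SUFFIX = re.compile(r"[_/\-\d]+$")

def strip_date(line):
    """Return `line` without its trailing date and the separator before it."""
    return DATE_SUFFIX.sub("", line)

lines = [
    "TRAVEL_DELAY_01072015",
    "TRAVEL_DELAY_01_07_2015",
    "TRAVEL_DELAY_2015/01/04",
    "Travel_Delay_Data_2015/01/04",
    "TRAVEL_DELAY_2015-01-04",
]
prefixes = [strip_date(line) for line in lines]
print(prefixes)
```

Because the pattern ends in $, underscores inside the prefix are left alone: a run only matches if it reaches the end of the string.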