Error with the ToDate function in Pig


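My input contains datetime data, and I want to load it correctly in Pig. After searching on Google, I learned that the recommended approach is to load the field as chararray and then convert it to datetime with the ToDate function. However, the same script works for one input but fails for another input with the same data format. My Pig version is 0.12.1. The script I am using: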

A = load '/user/ss/debug/debug' using PigStorage(',') as (AUDIT:chararray,JOB:chararray,TYPE:chararray,ID:long,STATUS_ID:long,POOL_NAME:chararray,SLA_PRIORITY:long,STATUS:chararray,RUN_ID:long,TASK:chararray,SCENARIO_ID:long,CREDIT_CNT:long,COMM_CNT:long,BONUS_CNT:long,PAYMENT_CNT:long,RUN_TIME:long,START_TIME:chararray,END_TIME:chararray,ITEM_COUNT:long); 

B = foreach A generate JOB, TYPE, ID, CREDIT_CNT, COMM_CNT, BONUS_CNT, PAYMENT_CNT, ToDate(START_TIME, 'yyyy-MM-dd HH:mm:ss') as (START_TIME_DT:datetime), ToDate(END_TIME, 'yyyy-MM-dd HH:mm:ss') as (END_TIME_DT:datetime), START_TIME, END_TIME, ITEM_COUNT; 

dump B;

The data looks like this:

Input that triggers the error:

D789FD70FE9E3ABBE0432165880A09E1,D789FD70FE9D3ABBE0432165880A09E1,VA,123,4946586,DEFAULT,1,Completed,,DD13,,0,0,0,0,0,2013-03-10 02:41:14,2013-03-10 02:41:16,0

Input that runs correctly:

C888E618A7740A71E0432165880ABCA3,C888E618A7730A71E0432165880ABCA3,VA,123,4680120,DEFAULT,1,Completed,,DD12,,0,0,0,0,0,2012-08-31 04:16:56,2012-08-31 04:17:02,0
C888FC5DA4B212F3E0432165880A3C34,C888FC5DA4B112F3E0432165880A3C34,VA,123,4680125,DEFAULT,1,Completed,,DD12,,0,0,0,0,0,2012-08-31 04:17:51,2012-08-31 04:17:57,0
C888FC5DA4B912F3E0432165880A3C34,C888FC5DA4B812F3E0432165880A3C34,VA,123,4680127,DEFAULT,1,Completed,,DD14,,0,0,0,0,0,2012-08-31 04:18:17,2012-08-31 04:18:22,0

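I don't understand why the same input schema and script produce different results. The error says: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles).

The error log is as follows: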

Backend error message
---------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:707)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
	at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:336)
	at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:672)
	at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:45)
	at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:33)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDateTime(POUserFunc.java:422)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:329)
	... 13 more

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias C. Backend error : Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias C. Backend error : Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
	at org.apache.pig.PigServer.openIterator(PigServer.java:870)
	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
	at org.apache.pig.Main.run(Main.java:541)
	at org.apache.pig.Main.main(Main.java:156)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:707)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
	at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:336)
	at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:672)
	at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:45)
	at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:33)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDateTime(POUserFunc.java:422)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:329)

Any help or suggestions would be greatly appreciated. Thanks a lot!

Sivasakthi Jayaraman

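It looks like the datetime "2013-03-10 02:41:14" does not exist in the 'America/Los_Angeles' time zone, most likely because of daylight saving time in the US: on that date the clocks jumped from 02:00 straight to 03:00, so no local time in between exists. The same input works fine in my time zone. To fix this, you need to specify the time zone 'America/Los_Angeles' explicitly as the third argument to the ToDate function.

Can you change the ToDate call like this?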

ToDate(START_TIME, 'yyyy-MM-dd HH:mm:ss','America/Los_Angeles') 
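
To see why only that one row fails, here is a small Java sketch (not part of the original answer) that checks whether a local timestamp falls into a daylight-saving gap. It uses the JDK's java.time classes as a stand-in for the Joda-Time library that Pig calls internally, so the class name DstGapCheck and the helper isInGap are hypothetical names chosen for illustration. A timestamp with no valid UTC offset in a zone is exactly the "illegal instant" that Joda-Time refuses to parse.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Illustrative helper only; not part of Pig or Joda-Time.
class DstGapCheck {

    // Same pattern string as the one passed to ToDate in the question.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns true when the local timestamp does not exist in the zone,
    // i.e. it falls inside a "spring forward" daylight-saving gap.
    static boolean isInGap(String text, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(text, FMT);
        // An empty list means no UTC offset maps to this local time.
        List<ZoneOffset> offsets =
                ZoneId.of(zoneId).getRules().getValidOffsets(local);
        return offsets.isEmpty();
    }

    public static void main(String[] args) {
        // Failing row from the question: inside the 2013-03-10 gap.
        System.out.println(isInGap("2013-03-10 02:41:14", "America/Los_Angeles"));
        // Working row from the question: an ordinary local time.
        System.out.println(isInGap("2012-08-31 04:16:56", "America/Los_Angeles"));
        // UTC has no daylight-saving transitions, so there is never a gap.
        System.out.println(isInGap("2013-03-10 02:41:14", "UTC"));
    }
}
```

Running a check like this over the START_TIME and END_TIME columns before loading them into Pig would flag the rows that cannot be parsed in the cluster's default time zone. Note that java.time itself resolves such a gap by shifting the time forward when a ZonedDateTime is built, whereas Joda-Time throws, which is why the sketch inspects the valid offsets instead of relying on an exception.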
