Pig: FLATTEN error

尼雅斯

I have a relation X with the structure X: {group: chararray,inboundCount: {(name: chararray,inb: long)},outboundCount: {(name: chararray,out: long)}} that looks like this:

(IAD,{},{(IAD,25)})
(LAX,{},{(LAX,2)})
(ORD,{(ORD,27)},{})
(PDX,{},{(PDX,3)}) 
(SFO,{(SFO,3)},{})

I want output like the following, with the structure final: {airport: chararray,inbound: long,outbound: long}

(IAD,,25)
(LAX,,2)
(ORD,27,)
(PDX,,3)
(SFO,3,)

I tried the following code, which gives me the schema I want, but nothing is printed. Is that because of the empty bags?

final = foreach X generate group as airport,FLATTEN(inboundCount.inb) as inbound,FLATTEN(outboundCount.out) as outbound;

Please help.

EDIT: X was obtained by running the following commands:

A= load '/user/hduser/airline.csv' using PigStorage(',') as (year:int,month:int,dayofmonth:int,dayofweek:int,dep:int,CRS:int,Arr:int,CRSArr:int,UniqueCarrier:chararray,FlightNum:int,TailNum:chararray,ActualElapsedTime:int,CRSElapsed:int,AirTime:int,ArrDelay:int,DepDelay:int,Origin:chararray,Dest:chararray,Distance:int,TaxiIn:int,TaxiOut:int,Cancelled:int,CancelCode:chararray,Diverted:int,CarrierDelay:int,WeatherDelay:int,NASDelay:int,SecurityDelay:int,LateAircraft:int);
B= foreach A generate year,month,UniqueCarrier,FlightNum,TailNum,Origin,Dest;
inbound = group B by Dest;
inboundCount = foreach inbound generate group as name,COUNT(B.FlightNum) as inb;
outbound = group B by Origin;
outboundCount = foreach outbound generate group as name,COUNT(B.FlightNum) as out;
X = COGROUP inboundCount BY name, outboundCount BY name;

Sample input record:

2008,1,31,4,1757,1155,2400,1758,UA,114,N845UA,243,243,217,362,362,LAX,ORD,1745,11,15,0 ,, 0,0,0,362,0,0

投降王

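You are almost there. FLATTEN on an empty bag produces no output rows, and every row of X contains one empty bag, which is why nothing is printed. Apply SUM instead of FLATTEN: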

 A= load '/user/hduser/airline.csv' using PigStorage(',') as (year:int,month:int,dayofmonth:int,dayofweek:int,dep:int,CRS:int,Arr:int,CRSArr:int,UniqueCarrier:chararray,FlightNum:int,TailNum:chararray,ActualElapsedTime:int,CRSElapsed:int,AirTime:int,ArrDelay:int,DepDelay:int,Origin:chararray,Dest:chararray,Distance:int,TaxiIn:int,TaxiOut:int,Cancelled:int,CancelCode:chararray,Diverted:int,CarrierDelay:int,WeatherDelay:int,NASDelay:int,SecurityDelay:int,LateAircraft:int);

B= foreach A generate year,month,UniqueCarrier,FlightNum,TailNum,Origin,Dest;

inbound = group B by Dest;

inboundCount = foreach inbound generate group as name,COUNT(B.FlightNum) as inb;

outbound = group B by Origin;

outboundCount = foreach outbound generate group as name,COUNT(B.FlightNum) as out;

X = COGROUP inboundCount BY name, outboundCount BY name;

final_data = FOREACH X GENERATE group as airport, SUM(inboundCount.inb) as inb, SUM(outboundCount.out) as out;

dump final_data;

Dumping final_data gives the expected result:

(IAD,,25)
(LAX,,2)
(ORD,27,)
(PDX,,3)
(SFO,3,)

If needed, you can still replace the NULL counts with 0:

 final_null_check = FOREACH final_data GENERATE airport,(inb is null ? 0 :inb) as inb_cnt, (out is null ? 0 : out) as out_cnt;

After the NULL check, dumping the final_null_check relation gives the following output:

 (IAD,0,25)
 (LAX,0,2)
 (ORD,27,0)
 (PDX,0,3)
 (SFO,3,0)
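
An alternative that avoids COGROUP altogether is a full outer join, which keeps airports that appear on only one side and fills the missing count with NULL. The following is a minimal sketch, not part of the original answer. It assumes inboundCount and outboundCount have the fields (name, inb) and (name, out), matching the schema of X shown in the question. The aliases joined and final_joined are made up for illustration.

```pig
-- Assumes inboundCount: (name, inb) and outboundCount: (name, out) already exist.
-- FULL OUTER keeps rows that have a match on only one side; the other side is NULL.
joined = JOIN inboundCount BY name FULL OUTER, outboundCount BY name;

-- Take the airport name from whichever side is present.
final_joined = FOREACH joined GENERATE
    (inboundCount::name IS NOT NULL ? inboundCount::name : outboundCount::name) AS airport,
    inboundCount::inb AS inbound,
    outboundCount::out AS outbound;

DUMP final_joined;
```

Because each airport appears at most once in each of the two count relations, the join yields one row per airport, so no SUM is needed. The same NULL-to-0 replacement shown above can be applied to final_joined if zeros are preferred.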
