I have the following data:
ItemID | ProdID | ItemTypeID
=======+========+===========
1001 | 100 | A
1002 | 100 | B
1003 | 100 | A
1004 | 100 | B
1005 | 100 | B <- successive itemtype (should be excluded)
1006 | 100 | C
1007 | 200 | C
1008 | 200 | A
I want to list the ItemTypeIDs for each ProdID (with LISTAGG), but without identical successive IDs. So my result should look like this:
ProdID | ItemTypes
=======+==========
100 | A,B,A,B,C <- not A,B,A,B,B,C (successive B)
200 | C,A
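This is tricky. listagg() doesn't even allow distinct, so all the work has to be done in a subquery.
You can identify runs of consecutive identical ItemTypeId values with the difference-of-row-numbers approach: within a run, the row number over the whole ProdId and the row number over (ProdId, ItemTypeId) both increase by one per row, so their difference stays constant. Once the groups are identified, aggregate by group and then apply listagg():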
select ProdId,
       listagg(ItemTypeId, ',') within group (order by seqnum) as ItemTypes
from (-- one row per run of consecutive identical ItemTypeId values
      select ProdId, ItemTypeId, count(*) as NumItems,
             row_number() over (partition by ProdId order by min(ItemId)) as seqnum
      from (select t.*,
                   -- constant within a run of the same ItemTypeId
                   (row_number() over (partition by ProdId order by ItemId) -
                    row_number() over (partition by ProdId, ItemTypeId order by ItemId)
                   ) as grp
            from t
           ) t
      group by ProdId, ItemTypeId, grp
     ) t
group by ProdId;
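To make the row-number difference easier to follow, here is a small Python sketch that mimics the two row_number() calls on the sample data from the question. It is only an illustration of the idea, not a replacement for the SQL; the function names island_keys and item_types are made up for this example.

```python
from collections import defaultdict

# Sample rows from the question: (ItemID, ProdID, ItemTypeID)
rows = [
    (1001, 100, "A"), (1002, 100, "B"), (1003, 100, "A"), (1004, 100, "B"),
    (1005, 100, "B"), (1006, 100, "C"), (1007, 200, "C"), (1008, 200, "A"),
]

def island_keys(rows):
    """Return (ItemID, ProdID, ItemTypeID, grp), where grp is the row-number difference."""
    rn_all = defaultdict(int)   # row_number() over (partition by ProdId order by ItemId)
    rn_type = defaultdict(int)  # row_number() over (partition by ProdId, ItemTypeId order by ItemId)
    out = []
    for item_id, prod_id, type_id in sorted(rows):  # sorted by ItemID
        rn_all[prod_id] += 1
        rn_type[(prod_id, type_id)] += 1
        grp = rn_all[prod_id] - rn_type[(prod_id, type_id)]
        out.append((item_id, prod_id, type_id, grp))
    return out

def item_types(rows):
    """Collapse each (ProdID, ItemTypeID, grp) run to one entry, ordered by its first ItemID."""
    first_item = {}
    for item_id, prod_id, type_id, grp in island_keys(rows):
        # Rows arrive in ItemID order, so the first one seen is min(ItemId) for the run.
        first_item.setdefault((prod_id, type_id, grp), item_id)
    result = defaultdict(list)
    for (prod_id, type_id, _grp), _first in sorted(first_item.items(), key=lambda kv: kv[1]):
        result[prod_id].append(type_id)
    return {prod_id: ",".join(types) for prod_id, types in result.items()}

for row in island_keys(rows):
    print(row)
print(item_types(rows))
```

For ProdID 100 the grp values come out as 0, 1, 1, 2, 2, 5. Items 1004 and 1005 are the only rows that share both ItemTypeID B and grp 2, so they collapse into a single entry.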
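EDIT:
An alternative approach to this problem uses lag() to find where each new group starts. The advantage of the previous method is that it also gives you the count of items in each group. This method may be easier to follow: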
select ProdId,
       listagg(ItemTypeId, ',') within group (order by ItemId) as ItemTypes
from (select t.*
      from (select t.*,
                   lag(ItemTypeId) over (partition by ProdId order by ItemId) as prev_ItemTypeId
            from t
           ) t
      -- keep only the first row of each run
      where prev_ItemTypeId is null or prev_ItemTypeId <> ItemTypeId
     ) t
group by ProdId;
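The same filter can be traced outside the database. The Python sketch below mirrors the lag() comparison on the sample data, keeping a row only when its ItemTypeID differs from the previous row of the same ProdID. The function name item_types_lag is hypothetical, and the sketch assumes ItemTypeID is never null.

```python
from collections import defaultdict

# Sample rows from the question: (ItemID, ProdID, ItemTypeID)
rows = [
    (1001, 100, "A"), (1002, 100, "B"), (1003, 100, "A"), (1004, 100, "B"),
    (1005, 100, "B"), (1006, 100, "C"), (1007, 200, "C"), (1008, 200, "A"),
]

def item_types_lag(rows):
    """Keep a row only if its type differs from the previous row of the same ProdID."""
    prev_type = {}             # plays the role of lag(ItemTypeId) per ProdID
    kept = defaultdict(list)
    for _item_id, prod_id, type_id in sorted(rows):  # sorted by ItemID
        # .get() returns None for the first row of a ProdID, like a null lag() value.
        if prev_type.get(prod_id) != type_id:
            kept[prod_id].append(type_id)
        prev_type[prod_id] = type_id
    return {prod_id: ",".join(types) for prod_id, types in kept.items()}

print(item_types_lag(rows))
```

Item 1005 is dropped because the previous row for ProdID 100 (item 1004) already has ItemTypeID B.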