表中具有任意间隔(没有日期/时间!)的给定数据定义如下:
START float
END float
VALUE varchar(40)
例如
START END VALUE
----- --- ------
0 1 Banana
1 3 Banana
3 4 Orange
4 7 Orange
7 8 Apple
8 9 Apple
9 10 Apple
10 15 Apple
20 22 Apple
22 23 Apple
23 28 Banana
28 30 Banana
etc..
如何汇总数据,以便在连续间隔内仅列出一个值。即查询结果应类似于:
START END VALUE
----- --- ------
0 3 Banana
3 7 Orange
7 15 Apple
20 23 Apple
23 30 Banana
请注意上面的15和20之间的差距。我正在处理大量数据(约50万行),但不经常运行查询。因此效率是非常好的。不用光标就可以做到吗?
(注意:如果使用SQL2008R2,则不能利用较新的功能,如果存在)
谢谢!
WITH TableWithPreviousAndNext AS (
SELECT CA1.[Previous]
,Table1.[Start]
,Table1.[End]
,CA2.[Next]
,Table1.[Value]
,(1 + ROW_NUMBER() OVER (PARTITION BY [Value] ORDER BY Table1.[Start])) / 2 AS [Group]
FROM Table1
CROSS APPLY (
SELECT MAX([End]) AS [Previous]
FROM Table1 AS InnerTable1
WHERE InnerTable1.[Value] = Table1.[Value]
AND InnerTable1.[Start] < Table1.[Start]
) AS CA1
CROSS APPLY (
SELECT MIN([Start]) AS Next
FROM Table1 AS InnerTable1
WHERE InnerTable1.[Value] = Table1.[Value]
AND InnerTable1.[Start] > Table1.[Start]
) AS CA2
CROSS APPLY ( -- A little trick to create a 2 row group for isolated rows
SELECT 1 AS Dummy
UNION ALL
SELECT 1
WHERE ([Previous] IS NULL OR [Previous] <> [Start])
AND ([Next] IS NULL OR [Next] <> [End])
) AS CA3
WHERE [Previous] IS NULL -- Remove all but first and last in sequence
OR [Next] IS NULL
OR [Previous] <> [Start]
OR [End] <> [Next]
)
SELECT MIN([Start])
,MAX([End])
,[Value]
FROM TableWithPreviousAndNext
GROUP BY [Value]
,[Group]
ORDER BY MIN(Start)
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句