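Every time I add a row to a table, I want to know how that row ranks against the table as it stood up to that point. This is easy to do with the RANK() window function. However, I am struggling to find a way to rank the row against only those earlier rows that match a given value.
As an example, I want to end up with this highly contrived table: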
    date    |  name   | animal_bought | num_sloths_bought_before | num_camels_bought_before
------------+---------+---------------+--------------------------+--------------------------
 2014-09-01 | Vincent | sloth         | 0                        | 0
 2014-09-01 | Luis    | camel         | 0                        | 0
 2014-09-02 | Vincent | sloth         | 1                        | 0
 2014-09-02 | Luis    | camel         | 0                        | 1
 2014-09-02 | Kevin   | sloth         | 0                        | 0
 2014-09-03 | Vincent | camel         | 1                        | 0
 2014-09-04 | Deo     | camel         | 0                        | 0
 2014-09-04 | Vincent | sloth         | 2                        | 1
 2014-09-05 | Luis    | camel         | 0                        | 2
 2014-09-05 | Andrew  | sloth         | 0                        | 0
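I initially looked at whether I could apply a filter inside the window function (e.g. RANK() OVER(PARTITION BY name WHERE animal_bought = 'sloth' ORDER BY date ASC) AS num_sloths_bought_before), but that is not syntactically valid. I then tried adding a subquery, like this: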
SELECT
date,
name,
animal_bought,
( SELECT
RANK() OVER(PARTITION BY name ORDER BY date ASC) - 1
FROM this_table
WHERE animal_bought = 'sloth'
) AS num_sloths_bought_before
FROM source_table
But Redshift threw this error:
ERROR: This type of correlated subquery pattern is not supported yet
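I also tried putting the window function inside a case statement (which throws the same error) and calculating the rank in a join query (which I could not get to work).
Hmm. I don't think this query would do what you want anyway: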
SELECT date, name, animal_bought,
(SELECT RANK() OVER(PARTITION BY name ORDER BY date ASC) - 1
FROM this_table
WHERE animal_bought = 'sloth'
) AS num_sloths_bought_before
FROM source_table
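For a few reasons:
The use of rank() suggests that more than one row in this_table matches animal_bought. Otherwise, you could use an aggregate function.
If only one row matches the where clause, the value is always 1, because the where clause is processed before rank().
Perhaps you just want rank() without the subquery?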
SELECT date, name, animal_bought,
       RANK() OVER (PARTITION BY name, animal_bought ORDER BY date ASC) - 1 as NumberBoughtBefore
FROM source_table;
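As a rough check of what this returns, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. Using SQLite in place of Redshift is an assumption (window functions need SQLite 3.25 or later), and the helper name same_animal_bought_before is made up for illustration.

```python
import sqlite3

# Sample purchases copied from the question's table (first three columns only).
PURCHASES = [
    ("2014-09-01", "Vincent", "sloth"),
    ("2014-09-01", "Luis", "camel"),
    ("2014-09-02", "Vincent", "sloth"),
    ("2014-09-02", "Luis", "camel"),
    ("2014-09-02", "Kevin", "sloth"),
    ("2014-09-03", "Vincent", "camel"),
    ("2014-09-04", "Deo", "camel"),
    ("2014-09-04", "Vincent", "sloth"),
    ("2014-09-05", "Luis", "camel"),
    ("2014-09-05", "Andrew", "sloth"),
]


def same_animal_bought_before():
    """Hypothetical helper: run the rank() query on the sample rows (SQLite, not Redshift)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_table (date TEXT, name TEXT, animal_bought TEXT)"
    )
    conn.executemany("INSERT INTO source_table VALUES (?, ?, ?)", PURCHASES)
    rows = conn.execute(
        """
        SELECT date, name, animal_bought,
               RANK() OVER (PARTITION BY name, animal_bought ORDER BY date ASC) - 1
        FROM source_table
        ORDER BY date, name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in same_animal_bought_before():
        print(row)
```

For Vincent this gives 0, 1, 0, 2 in date order: each row counts only his earlier purchases of the same animal, so a camel row says nothing about how many sloths came before it.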
If you want counts for both animals on every row, don't use rank(); use a cumulative sum:
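SELECT date, name, animal_bought,
       -- Redshift requires an explicit frame clause when an aggregate window function has ORDER BY
       sum(case when animal_bought = 'sloth' then 1 else 0 end) over (partition by name order by date rows unbounded preceding) as SlothsBefore,
       sum(case when animal_bought = 'camel' then 1 else 0 end) over (partition by name order by date rows unbounded preceding) as CamelsBefore
FROM source_table;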
EDIT: to leave the current row's own purchase out of the count, subtract it from the cumulative sum:
SELECT date, name, animal_bought,
       -- running total up to and including this row, minus this row's own purchase
       (sum(case when animal_bought = 'sloth' then 1 else 0 end) over (partition by name order by date rows unbounded preceding) -
        (case when animal_bought = 'sloth' then 1 else 0 end)
       ) as SlothsBefore,
       (sum(case when animal_bought = 'camel' then 1 else 0 end) over (partition by name order by date rows unbounded preceding) -
        (case when animal_bought = 'camel' then 1 else 0 end)
       ) as CamelsBefore
FROM source_table;
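To see how the cumulative-sum approach behaves on the question's data, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for Redshift is an assumption (it needs SQLite 3.25 or later), the column is taken to be animal_bought as in the question, and the helper name purchases_before is hypothetical.

```python
import sqlite3

# Sample purchases copied from the question's table (first three columns only).
PURCHASES = [
    ("2014-09-01", "Vincent", "sloth"),
    ("2014-09-01", "Luis", "camel"),
    ("2014-09-02", "Vincent", "sloth"),
    ("2014-09-02", "Luis", "camel"),
    ("2014-09-02", "Kevin", "sloth"),
    ("2014-09-03", "Vincent", "camel"),
    ("2014-09-04", "Deo", "camel"),
    ("2014-09-04", "Vincent", "sloth"),
    ("2014-09-05", "Luis", "camel"),
    ("2014-09-05", "Andrew", "sloth"),
]

# Same shape as the EDIT query above, with the window frame spelled out in full.
QUERY = """
SELECT date, name,
       SUM(CASE WHEN animal_bought = 'sloth' THEN 1 ELSE 0 END)
           OVER (PARTITION BY name ORDER BY date
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
         - (CASE WHEN animal_bought = 'sloth' THEN 1 ELSE 0 END) AS sloths_before,
       SUM(CASE WHEN animal_bought = 'camel' THEN 1 ELSE 0 END)
           OVER (PARTITION BY name ORDER BY date
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
         - (CASE WHEN animal_bought = 'camel' THEN 1 ELSE 0 END) AS camels_before
FROM source_table
"""


def purchases_before():
    """Hypothetical helper: map (date, name) to (sloths_before, camels_before)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_table (date TEXT, name TEXT, animal_bought TEXT)"
    )
    conn.executemany("INSERT INTO source_table VALUES (?, ?, ?)", PURCHASES)
    result = {(d, n): (s, c) for d, n, s, c in conn.execute(QUERY)}
    conn.close()
    return result


if __name__ == "__main__":
    for key, counts in sorted(purchases_before().items()):
        print(key, counts)
```

On the sample data this gives (2, 1) for Vincent's 2014-09-04 sloth and (0, 2) for Luis's 2014-09-05 camel, matching the question's table. For Vincent's 2014-09-03 camel it gives (2, 0), since he bought sloths on 2014-09-01 and 2014-09-02; the question's table lists 1 for that row.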