我有一个如下表:
ID | col1 | col2 | col3 | 第4列 |
---|---|---|---|---|
一种 | 100 | 400 | 30 | 800 |
乙 | 600 | 50 | 500 | 75 |
我想要一个查询,我可以返回类似
ID | col1 | col2 | col3 | 第4列 |
---|---|---|---|---|
一种 | 0 | 1个 | 0 | 1个 |
乙 | 1个 | 0 | 1个 | 0 |
除此之外,我希望逻辑查看每一行,并为每一行查找哪两列的前2个值最高。我想可能会涉及一些CTE或子查询。即使是会产生以下结果的CTE也足够好,但不知道如何达到此CTE:
ID | top_2_col_name |
---|---|
一种 | col2 |
一种 | 第4列 |
乙 | col1 |
乙 | col3 |
有没有一种方法可以按行而不是按列进行聚合和窗口函数?我正在使用Google的BigQuery SQL。
如果需要前两个值,则一种方法是取消嵌套这些值并计算排名并选择它们:
with t as (
select 'A' as id, 100 as col1, 400 as col2, 30 as col3, 800 as col4 union all
select 'B' as id, 600 as col1, 50 as col2, 500 as col3, 75 as col4
)
select * except (seqnum)
from (select t.id, col.*, row_number() over (partition by t.id order by col.val desc) as seqnum
from t cross join
unnest(array[struct('col1' as col, t.col1 as val),
struct('col2', t.col2),
struct('col3', t.col3),
struct('col4', t.col4)
]
) col
) tc
where seqnum <= 2;
这是结果集的第二种形式。
您可以使用JSON技巧将其概括为任意数量的列。这将产生一个字符串,然后为您关心的列解析该字符串,取消嵌套它们并执行类似的操作:
with t as (
select 'A' as id, 100 as col1, 400 as col2, 30 as col3, 800 as col4 union all
select 'B' as id, 600 as col1, 50 as col2, 500 as col3, 75 as col4
)
select t.id, concat('col', n), val
from (select t.id, val, n, row_number() over (partition by t.id order by val desc) as seqnum
from t cross join
unnest(regexp_extract_all(to_json_string(t), '"col[0-9]+":([0-9]+)')) val with offset n
) t
where seqnum <= 2;
这可以在任意数量的列上使用。当然,如果您具有这样的数据结构,则值应确实存储在数组中。
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句