从SQL中许多列名中具有最高值的前2个列名中获取二进制值

debugcn 发表于 Dev

孙凯文

我有一个如下表：

ID	col1	col2	col3	第4列
一种	100	400	30	800
乙	600	50	500	75

我想要一个查询，我可以返回类似

ID	col1	col2	col3	第4列
一种	0	1个	0	1个
乙	1个	0	1个	0

除此之外，我希望逻辑查看每一行，并为每一行查找哪两列的前2个值最高。我想可能会涉及一些CTE或子查询。即使是会产生以下结果的CTE也足够好，但不知道如何达到此CTE：

ID	top_2_col_name
一种	col2
一种	第4列
乙	col1
乙	col3

有没有一种方法可以按行而不是按列进行聚合和窗口函数？我正在使用Google的BigQuery SQL。

戈登·利诺夫

如果需要前两个值，则一种方法是取消嵌套这些值并计算排名并选择它们：

with t as (
      select 'A' as id, 100 as col1, 400 as col2,  30 as col3, 800 as col4 union all
      select 'B' as id, 600 as col1, 50  as col2, 500 as col3, 75 as col4 
     )
select * except (seqnum)
from (select t.id, col.*, row_number() over (partition by t.id order by col.val desc) as seqnum
      from t cross join
           unnest(array[struct('col1' as col, t.col1 as val),
                              struct('col2', t.col2),
                              struct('col3', t.col3),
                              struct('col4', t.col4)
                             ]
                        ) col
     ) tc
where seqnum <= 2;

这是结果集的第二种形式。

您可以使用JSON技巧将其概括为任意数量的列。这将产生一个字符串，然后为您关心的列解析该字符串，取消嵌套它们并执行类似的操作：

with t as (
      select 'A' as id, 100 as col1, 400 as col2,  30 as col3, 800 as col4 union all
      select 'B' as id, 600 as col1, 50  as col2, 500 as col3, 75 as col4 
     )
select t.id, concat('col', n), val
from (select t.id, val, n, row_number() over (partition by t.id order by val desc) as seqnum
      from t cross join
           unnest(regexp_extract_all(to_json_string(t), '"col[0-9]+":([0-9]+)')) val with offset n
     ) t
where seqnum <= 2;

这可以在任意数量的列上使用。当然，如果您具有这样的数据结构，则值应确实存储在数组中。

本文收集自互联网，转载请注明来源。

如有侵权，请联系[email protected] 删除。