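As explained here, StackOverflow questions can be closed for several reasons, namely: duplicate, off-topic because..., needs details or clarity, needs more focus, and opinion-based.
The queries are executed on the public StackOverflow BigQuery dataset on Google Cloud Platform. Among other tables, this dataset contains posts_questions and votes: the first holds all questions, the second holds the votes cast on those questions.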
posts_questions schema:
id | title | body | accepted_answer_id | answer_count | comment_count | ... |
---|
votes schema:
id | creation_date | post_id | vote_type_id |
---|
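There are 16 distinct vote_type_id values; according to this post on Meta, vote_type_id 6 corresponds to a close vote. A question on StackOverflow is closed once users have cast three close votes. Therefore, the following query returns the IDs and URLs of 10 closed questions.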
SELECT q.id, CONCAT('https://stackoverflow.com/questions/', CAST(q.id AS STRING)) AS url
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
JOIN `bigquery-public-data.stackoverflow.votes` AS v
ON q.id = v.post_id
WHERE v.vote_type_id = 6
GROUP BY q.id
HAVING COUNT(*) >= 3
LIMIT 10
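My question is: is it possible to query closed questions by the reason for the close votes? For example, to query 250 questions that were closed as duplicates?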
Like [howto] query questions that were closed as duplicates?
You should use the PostHistory table (post_history in the BigQuery dataset).
PostHistoryTypeId
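Comment: this field contains the comment made by the user who edited the post. If PostHistoryTypeId = 10, this field contains the CloseReasonId of the close reason.
So, finally, the query is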
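To see what that field looks like in practice, a small exploratory query can list a few close events together with the raw comment value. This is only a sketch: it assumes the post_history table exposes creation_date, post_id, post_history_type_id and comment columns, and the alias close_reason_id is introduced here purely for illustration.

```sql
-- Sketch: peek at the ten most recent close events and their raw reason ids.
-- close_reason_id is an illustrative alias, not a column of the dataset.
SELECT
  h.post_id,
  h.creation_date,
  h.comment AS close_reason_id
FROM `bigquery-public-data.stackoverflow.post_history` AS h
WHERE h.post_history_type_id = 10   -- 10 = post closed
ORDER BY h.creation_date DESC
LIMIT 10
```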
SELECT q.id, CONCAT('https://stackoverflow.com/questions/', CAST(q.id AS STRING)) AS url
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
JOIN `bigquery-public-data.stackoverflow.post_history` AS h
ON q.id = h.post_id
WHERE h.post_history_type_id = 10
AND h.comment in ('1', '101')
GROUP BY q.id
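The question asked for 250 duplicates specifically, so the same query can be capped with LIMIT. The following is a hedged variant rather than a definitive answer: it reuses the reason ids '1' and '101' from the query above, and it likely also returns questions that were closed as duplicates and reopened later, because the close event stays in the post history.

```sql
-- Sketch: up to 250 questions that have at least one "closed as duplicate" event.
SELECT
  q.id,
  CONCAT('https://stackoverflow.com/questions/', CAST(q.id AS STRING)) AS url
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
JOIN `bigquery-public-data.stackoverflow.post_history` AS h
  ON q.id = h.post_id
WHERE h.post_history_type_id = 10   -- 10 = post closed
  AND h.comment IN ('1', '101')     -- duplicate reason ids, as in the query above
GROUP BY q.id                       -- collapse repeated close events per question
LIMIT 250
```

Without an ORDER BY, which 250 rows come back is arbitrary; add one (for example on q.id) if a stable sample is needed.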
My question is: is it possible to query closed questions by the reason for the close votes?
SELECT CASE
WHEN comment IN ('1', '101') THEN 'Duplicate'
WHEN comment = '102' THEN 'Off-topic'
WHEN comment = '3' THEN 'Subjective and argumentative'
WHEN comment = '4' THEN 'Not a real question'
WHEN comment = '7' THEN 'Too localized'
WHEN comment = '10' THEN 'General reference'
WHEN comment = '20' THEN 'Noise or pointless (Meta sites only)'
WHEN comment = '103' THEN '''Unclear what you're asking'''
WHEN comment = '104' THEN 'Too broad'
WHEN comment = '105' THEN 'Primarily opinion-based'
ELSE 'Unknown'
END close_reason,
COUNT(DISTINCT q.id) AS closed_posts
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
JOIN `bigquery-public-data.stackoverflow.post_history` AS h
ON q.id = h.post_id
WHERE h.post_history_type_id = 10
GROUP BY close_reason
ORDER BY closed_posts DESC
with output