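I'm trying to learn R, and I decided to do so by building something that reads my state's live election results as they are posted on election night. Unfortunately, I've hit a snag in calculating the Margin value to use for the map fill. My state (WA) uses a Top 2 primary, which means that in some races two candidates from the same party appear on the November ballot. That is probably too much background, but in any case here is the coding problem:
I have a data frame that looks like this: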
Dist Party Votes
1 (Prefers Democratic Party) 124151
1 (Prefers Republican Party) 101428
2 (Prefers Democratic Party) 122173
2 (Prefers Republican Party) 79518
3 (Prefers Republican Party) 124796
3 (Prefers Democratic Party) 78018
4 (Prefers Republican Party) 75307
4 (Prefers Republican Party) 77772
5 (Prefers Republican Party) 135470
5 (Prefers Democratic Party) 87772
6 (Prefers Democratic Party) 141265
6 (Prefers Republican Party) 83025
7 (Prefers Democratic Party) 203954
7 (Prefers Republican Party) 47921
8 (Prefers Republican Party) 125741
8 (Prefers Democratic Party) 73003
9 (Prefers Democratic Party) 118132
9 (Prefers Republican Party) 48662
10 (Prefers Democratic Party) 99279
10 (Prefers Republican Party) 82213
I would like it to look like this:
Dist (Prefers Democratic Party) (Prefers Republican Party)
1 124151 101428
2 122173 79518
3 78018 124796
4 [NA or 0] 153079
5 87772 135470
6 141265 83025
7 203954 47921
8 73003 125741
9 118132 48662
10 99279 82213
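spread() does not work because of Dist = 4, which has two rows with the same Party value. I've managed to cobble something together from other questions here, but I'm not happy with it, and I'm almost certain there is a better way: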
library(tidyr)
library(dplyr)
CongressTidy %>%
group_by(Dist) %>%
mutate(GOPVotes = sum(ifelse(Party == "(Prefers Republican Party)", Votes, 0))) %>%
mutate(DemVotes = sum(ifelse(Party == "(Prefers Democratic Party)", Votes, 0)))
Which returns this:
Dist Party Votes GOPVotes DemVotes
<fctr> <fctr> <int> <dbl> <dbl>
1 (Prefers Democratic Party) 124151 101428 124151
1 (Prefers Republican Party) 101428 101428 124151
2 (Prefers Democratic Party) 122173 79518 122173
2 (Prefers Republican Party) 79518 79518 122173
3 (Prefers Republican Party) 124796 124796 78018
3 (Prefers Democratic Party) 78018 124796 78018
4 (Prefers Republican Party) 75307 153079 0
4 (Prefers Republican Party) 77772 153079 0
5 (Prefers Republican Party) 135470 135470 87772
5 (Prefers Democratic Party) 87772 135470 87772
6 (Prefers Democratic Party) 141265 83025 141265
6 (Prefers Republican Party) 83025 83025 141265
7 (Prefers Democratic Party) 203954 47921 203954
7 (Prefers Republican Party) 47921 47921 203954
8 (Prefers Republican Party) 125741 125741 73003
8 (Prefers Democratic Party) 73003 125741 73003
9 (Prefers Democratic Party) 118132 48662 118132
9 (Prefers Republican Party) 48662 48662 118132
10 (Prefers Democratic Party) 99279 82213 99279
10 (Prefers Republican Party) 82213 82213 99279
That is fine as far as it goes, and I can add a selector column and select by it:
CongressMargins <- CongressTidy %>%
group_by(Dist) %>%
mutate(GOPVotes = sum(ifelse(Party == "(Prefers Republican Party)", Votes, 0))) %>%
mutate(DemVotes = sum(ifelse(Party == "(Prefers Democratic Party)", Votes, 0))) %>%
mutate(selector = c(1,2)) %>%
subset(selector == 1, select = c(Dist, GOPVotes, DemVotes))
Which gives me what I want, and I can calculate the margin from there just fine:
Dist GOPVotes DemVotes
<fctr> <dbl> <dbl>
1 101428 124151
2 79518 122173
3 124796 78018
4 153079 0
5 135470 87772
6 83025 141265
7 47921 203954
8 125741 73003
9 48662 118132
10 82213 99279
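But it will get screwed up if there are 2 unopposed races, because it relies on vector recycling. It's just ugly, and there must be a better way. Any ideas?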
We can calculate the group sums first and then spread. If you want the missing cell to be 0 instead of NA, use spread(Party, Votes, fill = 0).
library(tidyverse)
dat2 <- dat %>%
group_by(Dist, Party) %>%
summarise(Votes = sum(Votes)) %>%
spread(Party, Votes) %>%
ungroup()
dat2
# # A tibble: 10 x 3
# Dist `(Prefers Democratic Party)` `(Prefers Republican Party)`
# <int> <int> <int>
# 1 1 124151 101428
# 2 2 122173 79518
# 3 3 78018 124796
# 4 4 NA 153079
# 5 5 87772 135470
# 6 6 141265 83025
# 7 7 203954 47921
# 8 8 73003 125741
# 9 9 118132 48662
# 10 10 99279 82213
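As a variation on the approach above: the tidyr documentation lists spread() as superseded by pivot_wider(), which can do the summing and the zero-filling in one call. The following is only a sketch, assuming tidyr 1.1.0 or later; the object names `votes` and `wide` are made up for illustration, and the data is a three-district subset of the question's table.

```r
library(dplyr)
library(tidyr)

# Subset of the question's data (districts 3-5); `votes` is an illustrative name
votes <- data.frame(
  Dist  = c(3, 3, 4, 4, 5, 5),
  Party = c("(Prefers Republican Party)", "(Prefers Democratic Party)",
            "(Prefers Republican Party)", "(Prefers Republican Party)",
            "(Prefers Republican Party)", "(Prefers Democratic Party)"),
  Votes = c(124796, 78018, 75307, 77772, 135470, 87772),
  stringsAsFactors = FALSE
)

wide <- votes %>%
  pivot_wider(
    id_cols     = Dist,
    names_from  = Party,   # one column per party label
    values_from = Votes,
    values_fn   = sum,     # collapse same-party rows within a district (Dist 4)
    values_fill = 0        # districts with no candidate from a party get 0, not NA
  )

wide
```

Because values_fn sums duplicates before widening, the two Republican rows in district 4 become a single cell, which is the case that made a plain spread() fail.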
Data
dat <- read.table(text = "Dist Party Votes
1 '(Prefers Democratic Party)' 124151
1 '(Prefers Republican Party)' 101428
2 '(Prefers Democratic Party)' 122173
2 '(Prefers Republican Party)' 79518
3 '(Prefers Republican Party)' 124796
3 '(Prefers Democratic Party)' 78018
4 '(Prefers Republican Party)' 75307
4 '(Prefers Republican Party)' 77772
5 '(Prefers Republican Party)' 135470
5 '(Prefers Democratic Party)' 87772
6 '(Prefers Democratic Party)' 141265
6 '(Prefers Republican Party)' 83025
7 '(Prefers Democratic Party)' 203954
7 '(Prefers Republican Party)' 47921
8 '(Prefers Republican Party)' 125741
8 '(Prefers Democratic Party)' 73003
9 '(Prefers Democratic Party)' 118132
9 '(Prefers Republican Party)' 48662
10 '(Prefers Democratic Party)' 99279
10 '(Prefers Republican Party)' 82213",
header = TRUE, stringsAsFactors = FALSE)
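If the GOPVotes and DemVotes columns from the question are the real goal, one possible alternative is to replace mutate() plus the selector column with a single summarise(), which returns one row per district without relying on vector recycling. This is a sketch rather than the answerer's method. The Margin formula is an assumption (Democratic share minus Republican share of the two-party total), since the question does not define it, and `votes` and `margins` are illustrative names.

```r
library(dplyr)

# Subset of the question's data (districts 3-5); `votes` is an illustrative name
votes <- data.frame(
  Dist  = c(3, 3, 4, 4, 5, 5),
  Party = c("(Prefers Republican Party)", "(Prefers Democratic Party)",
            "(Prefers Republican Party)", "(Prefers Republican Party)",
            "(Prefers Republican Party)", "(Prefers Democratic Party)"),
  Votes = c(124796, 78018, 75307, 77772, 135470, 87772),
  stringsAsFactors = FALSE
)

margins <- votes %>%
  group_by(Dist) %>%
  summarise(
    # Subsetting by party and summing; an empty subset sums to 0
    GOPVotes = sum(Votes[Party == "(Prefers Republican Party)"]),
    DemVotes = sum(Votes[Party == "(Prefers Democratic Party)"])
  ) %>%
  # Assumed definition: positive means a Democratic lead, negative a Republican lead
  mutate(Margin = (DemVotes - GOPVotes) / (DemVotes + GOPVotes))

margins
```

With this definition a district contested by two Republicans, such as district 4, gets a Margin of -1, so it may deserve separate handling in the map fill.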