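The core of this question is how to use dplyr when the group_by information comes from a different data.frame than the units being summarised. Example: I have locations that were assigned to groups elsewhere. Each unique assignment of locations to groups is a plan. There are thousands of plans. I want summary statistics for each plan.
I am currently doing this in a slow nested for loop and would like to speed it up as much as possible. I hoped I could use group_by and summarise, but the syntax eludes me, and the examples I have found all do the lookup within the same tibble or data.frame. Reproducible example: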
# locations (x,y), populations at those locations (popA, popB)
df <- data.frame(x = rep(1:3, times = 3),
y = c(1,1,1,2,2,2,3,3,3),
popA = c(1,2,3,4,5,6,7,8,9),
popB = c(10,11,12,13,14,15,16,17,18))
# plans (Runs 1 through 3) each plan is a column in the data.frame and the
# value indicates the group to which each location was assigned in that plan
result <- data.frame(Run1 = c(1,1,1,2,2,2,3,3,3),
                     Run2 = c(1,2,3,1,2,3,1,2,3),
                     Run3 = c(1,1,3,2,2,3,3,3,3))
#The data.frame where I will store my summary statistics.
#Plan | District | Pop A | Pop B | Total
pop.by.dist <- data.frame(Plan = rep(NA,(max(result$Run1))*length(colnames(result))),
                          District = NA, PopA = NA, PopB = NA, Total = NA)
counter <- 1
for(i in 1:length(colnames(result))){ #for every plan
  for(j in 1:max(result)){ #for every district
    tmp <- colSums(df[result[,i]==j,c("popA","popB")])
    pop.by.dist[counter,] <- c(colnames(result)[i],j,tmp,sum(tmp))
    counter <- counter+1
  }
}
pop.by.dist #output has one row per plan * district combination
#> pop.by.dist
# Plan District PopA PopB Total
#1 Run1 1 6 33 39
#2 Run1 2 15 42 57
#3 Run1 3 24 51 75
#4 Run2 1 12 39 51
#5 Run2 2 15 42 57
#6 Run2 3 18 45 63
#7 Run3 1 3 21 24
#8 Run3 2 9 27 36
#9 Run3 3 33 78 111
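I know there are already plenty of related questions here, but the specific need to look up the groups from a different data.frame made it hard for me to find the right one. I am not a new user and have spent some time looking for an answer that works for me, so before marking this as a duplicate, please just provide code that solves my problem. You might help the next person.
If simply binding the two data frames together is not a problem, do that first: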
new_df <- cbind(df, result)
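Then pivot the data into long format, group it, and summarise:
library(dplyr)
library(tidyr)

new_df %>%
  pivot_longer(c(Run1, Run2, Run3),
               names_to = "Plan",
               values_to = "District") %>%
  group_by(Plan, District) %>%
  summarise_at(vars(popA, popB), sum) %>%   # sum each population within Plan x District
  mutate(Total = popA + popB)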
# A tibble: 9 x 5
# Groups: Plan [3]
Plan District popA popB Total
<chr> <dbl> <dbl> <dbl> <dbl>
1 Run1 1 6 33 39
2 Run1 2 15 42 57
3 Run1 3 24 51 75
4 Run2 1 12 39 51
5 Run2 2 15 42 57
6 Run2 3 18 45 63
7 Run3 1 3 21 24
8 Run3 2 9 27 36
9 Run3 3 33 78 111
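For comparison, here is a minimal base R sketch of the same per-plan, per-district totals using `rowsum()`, with no package dependencies. It is not part of the answer above. It assumes, like the `cbind()` step, that row i of `df` and row i of `result` describe the same location. The helper `summarise_plan` and the name `pop_by_dist` are made up for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(x = rep(1:3, times = 3),
                 y = c(1,1,1,2,2,2,3,3,3),
                 popA = c(1,2,3,4,5,6,7,8,9),
                 popB = c(10,11,12,13,14,15,16,17,18))
result <- data.frame(Run1 = c(1,1,1,2,2,2,3,3,3),
                     Run2 = c(1,2,3,1,2,3,1,2,3),
                     Run3 = c(1,1,3,2,2,3,3,3,3))

# Hypothetical helper: sum popA and popB by district for one plan (column of result).
# Assumes rows of df and result are aligned by position.
summarise_plan <- function(plan) {
  sums <- rowsum(df[, c("popA", "popB")], group = result[[plan]])
  data.frame(Plan = plan,
             District = as.numeric(rownames(sums)),
             PopA = sums$popA,
             PopB = sums$popB,
             Total = sums$popA + sums$popB)
}

# One small data.frame per plan, stacked into a single table
pop_by_dist <- do.call(rbind, lapply(names(result), summarise_plan))
pop_by_dist
```

One difference from the loop in the question: this sketch only returns districts that actually occur in a plan, whereas the loop also writes a row of zeros for any district number up to `max(result)` that a plan leaves empty.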