I have a data frame (DF), for example:
x y
1 " Accession of China" 0.401
2 " Afghanistan" 0.486
3 " Albania" 0.581
4 " Algeria" 0.431
5 " Andean Community" 0.341
6 " Andorra" 0.378
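It has a list of countries (x) and a value (y) associated with each country. I need to compute, for every possible pair of countries, the mean of the two values.
Example: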
x y
1 "Accession of China - Afghanistan" (0.401 + 0.486)/2
2 "Accession of China - Albania" (0.401 + 0.581)/2
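This should be done for all possible combinations, without repeating any combination. My challenge is finding a way to do this with the tidyverse.
Thank you very much :)
You can use combn: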
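library(dplyr) # requires dplyr >= 1.0.0
# combn(v, 2, FUN) applies FUN to every unordered pair of elements of v
# Note: in dplyr >= 1.1.0, summarise() returning several rows is deprecated; reframe() is the replacement
result <- DF %>%
  summarise(x = combn(x, 2, paste0, collapse = '-'),
            y = combn(y, 2, mean))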
result
# x y
#1 Accession of China- Afghanistan 0.4435
#2 Accession of China- Albania 0.4910
#3 Accession of China- Algeria 0.4160
#4 Accession of China- Andean Community 0.3710
#5 Accession of China- Andorra 0.3895
#6 Afghanistan- Albania 0.5335
#7 Afghanistan- Algeria 0.4585
#8 Afghanistan- Andean Community 0.4135
#9 Afghanistan- Andorra 0.4320
#10 Albania- Algeria 0.5060
#11 Albania- Andean Community 0.4610
#12 Albania- Andorra 0.4795
#13 Algeria- Andean Community 0.3860
#14 Algeria- Andorra 0.4045
#15 Andean Community- Andorra 0.3595
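To make the role of combn's third argument clearer, here is a minimal sketch on toy input. The names `labels`, `values`, `pairs`, `pair_names` and `pair_means` are made up for illustration and do not come from the question's data.

```r
# Toy input (hypothetical values, not the question's data)
labels <- c("A", "B", "C")
values <- c(1, 2, 4)

# Without a function, combn() returns a matrix with one column per pair
pairs <- combn(labels, 2)
print(pairs)

# With a function, it is applied to each pair, giving one result per pair
pair_names <- combn(labels, 2, paste0, collapse = "-")
pair_means <- combn(values, 2, mean)

print(pair_names)
print(pair_means)
```

Because both calls enumerate the pairs in the same order, the i-th name and the i-th mean refer to the same pair, which is why the two columns in the answer line up.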
This can also be done in base R:
result <- data.frame(x = combn(DF$x, 2, paste0, collapse = '-'),
                     y = combn(DF$y, 2, mean))
Data
DF <- structure(list(x = c(" Accession of China", " Afghanistan", " Albania",
" Algeria", " Andean Community", " Andorra"), y = c(0.401, 0.486,
0.581, 0.431, 0.341, 0.378)), class = "data.frame", row.names = c(NA, -6L))
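The country names in this data start with a space, so the labels produced above look like " Accession of China- Afghanistan". If the format from the question ("Accession of China - Afghanistan") is wanted, one possible variant is sketched below. It assumes that trimming whitespace with trimws() is acceptable for this data; the name `clean` is hypothetical.

```r
# Same data as above, rebuilt so that this snippet runs on its own
DF <- data.frame(
  x = c(" Accession of China", " Afghanistan", " Albania",
        " Algeria", " Andean Community", " Andorra"),
  y = c(0.401, 0.486, 0.581, 0.431, 0.341, 0.378),
  stringsAsFactors = FALSE
)

# trimws() strips the leading spaces; " - " is the separator from the question's example
clean <- data.frame(
  x = combn(trimws(DF$x), 2, paste, collapse = " - "),
  y = combn(DF$y, 2, mean),
  stringsAsFactors = FALSE
)

# Six countries give choose(6, 2) unordered pairs
print(nrow(clean))
print(head(clean, 2))
```

The same two combn() calls could be placed inside the dplyr pipeline instead; base R is used here only to keep the sketch free of package dependencies.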