Building a cross table from the column means of two data frames

Posted by debugcn in Dev

Angel

I have two data frames. One, called "students.short", is created like this:

students.short <- data.frame(shoesize=c(38,39,38,38,39,38,37,36),
 population=c("kuopio","kuopio","kuopio","tampere",
 "tampere","tampere","tampere","tampere"))

students.short

  shoesize population
1       38     kuopio
2       39     kuopio
3       38     kuopio
4       38    tampere
5       39    tampere
6       38    tampere
7       37    tampere
8       36    tampere

The other is called "students.tall":

students.tall <- data.frame(shoesize=c(44,42,43,43,42,44,43,43),
 population=c("kuopio","kuopio","kuopio","kuopio",
 "tampere","tampere","tampere","tampere"))

students.tall

  shoesize population
1       44     kuopio
2       42     kuopio
3       43     kuopio
4       43     kuopio
5       42    tampere
6       44    tampere
7       43    tampere
8       43    tampere

I need to create a cross table of population (kuopio or tampere) against the mean shoe size in each data frame, for example:

                 kuopio   tampere
students.short     38.3      37.6
students.tall      43        43

I can't find a clean or simple way to do this. Any ideas or help?

Pennant

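In one go, using data.table:

First, create a named list of data.tables (using setDT()).
Then, bind the list together with rbindlist(), using the list names as the id (idcol = TRUE).
Finally, dcast to wide format, aggregating value.var = "shoesize" with mean.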

Code

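library( data.table )

# setDT() converts each data frame to a data.table by reference;
# rbindlist() stacks them and stores the list names in an ".id" column
dcast( rbindlist( list( students.short = setDT( students.short ), 
                        students.tall = setDT( students.tall ) ),
                  idcol = TRUE ),
       .id ~ population,          # one row per data frame, one column per population
       value.var = "shoesize", 
       fun.aggregate = mean )     # average shoesize within each cell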

#               .id   kuopio tampere
# 1: students.short 38.33333    37.6
# 2:  students.tall 43.00000    43.0
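
As a cross-check that does not depend on data.table, the same table can be sketched in base R. This is not part of the answer above, only an alternative technique (tapply() inside sapply()), and it assumes both data frames are defined as in the question. The names frames and means are made up for illustration.

```r
# Data from the question, repeated so the sketch runs on its own
students.short <- data.frame(shoesize = c(38, 39, 38, 38, 39, 38, 37, 36),
                             population = c("kuopio", "kuopio", "kuopio", "tampere",
                                            "tampere", "tampere", "tampere", "tampere"))
students.tall <- data.frame(shoesize = c(44, 42, 43, 43, 42, 44, 43, 43),
                            population = c("kuopio", "kuopio", "kuopio", "kuopio",
                                           "tampere", "tampere", "tampere", "tampere"))

# Named list: the names become the row labels of the result (hypothetical name)
frames <- list(students.short = students.short,
               students.tall  = students.tall)

# For each data frame, tapply() computes the mean shoesize per population.
# sapply() puts one data frame per column, so t() flips the result to get
# one row per data frame and one column per population.
means <- t(sapply(frames, function(df) tapply(df$shoesize, df$population, mean)))

print(means)
```

The result is a plain numeric matrix rather than a data.table, so a cell can be read directly, for example means["students.short", "tampere"]. It should agree with the dcast() output above: about 38.33 and 37.6 for students.short, and 43 in both columns for students.tall.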
