How to find the statistical mode for each ID


濑户吾郎

Here are the observations for two individuals in my dataset.

data=structure(list(id = c(2L, 2L, 2L, 3L, 3L, 3L), trt = c(1L, 1L, 
1L, 1L, 1L, 1L), status = c(0L, 0L, 0L, 2L, 2L, 2L), stage = c(3L, 
3L, 3L, 4L, 4L, 4L), spiders = c(1L, 1L, 1L, 0L, 1L, 0L), sex = structure(c(2L, 
2L, 2L, 1L, 1L, 1L), .Label = c("m", "f"), class = "factor"), 
    hepato = c(1L, 1L, 1L, 0L, 1L, 0L), edema = c(0, 0, 0, 0.5, 
    0, 0.5), ascites = c(0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 
-6L), class = "data.frame")

I want to compute the statistical mode for each individual after grouping by id. I used the code below:

library(dplyr)
library(modeest)

    data%>%
      group_by(id)%>%mutate(edema2=mlv(edema))

When I compute the mode this way I get an error message, whereas the same approach works with other statistics such as mean, sd, min, max...

Ronak Shah

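The warning you are getting is telling you two things.

You have not specified which method to use, so the default method 'shorth' is used.
It makes a recommendation regarding the selection of the "mode" value.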

Also, why not use the Mode function from here:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
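
As a quick check of what this function returns, here is a minimal sketch that applies it to the edema values of each id from the question, plus one tied vector. The vector names edema_id2 and edema_id3 are only illustrative and are not part of the original post.

```r
# Mode function quoted above, repeated so this snippet runs on its own
Mode <- function(x) {
  ux <- unique(x)                        # distinct values, in order of first appearance
  ux[which.max(tabulate(match(x, ux)))]  # value with the highest count
}

edema_id2 <- c(0, 0, 0)      # edema values for id 2 (illustrative name)
edema_id3 <- c(0.5, 0, 0.5)  # edema values for id 3 (illustrative name)

Mode(edema_id2)
Mode(edema_id3)

# Tie: 1 and 2 both occur twice. which.max() picks the first maximum,
# so the value that appears first in x is returned.
Mode(c(1, 2, 2, 1))
```

Because it always returns a single value, it can be used inside mutate() or summarise() in the same way as mean or max. Keep in mind that with ties it silently returns the first value encountered.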

To apply it by group, you can use dplyr as follows:

library(dplyr)
data%>% group_by(id)%>% mutate(edema2= Mode(edema))

#     id   trt status stage spiders sex   hepato edema ascites edema2
#  <int> <int>  <int> <int>   <int> <fct>  <int> <dbl>   <int>  <dbl>
#1     2     1      0     3       1 f          1   0         0    0  
#2     2     1      0     3       1 f          1   0         0    0  
#3     2     1      0     3       1 f          1   0         0    0  
#4     3     1      2     4       0 m          0   0.5       0    0.5
#5     3     1      2     4       1 m          1   0         0    0.5
#6     3     1      2     4       0 m          0   0.5       0    0.5
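
If you would rather not depend on dplyr, the same per-group result can probably be obtained with base R. The following is only a sketch under the assumption that a reduced copy of the data (just id and edema, called df here) is enough to illustrate the idea. ave() keeps one row per observation, while tapply() gives one value per id.

```r
# Mode function from above, repeated so this snippet runs on its own
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Reduced copy of the question's data: only the id and edema columns
# (df is an illustrative name, not from the original post)
df <- data.frame(
  id    = c(2L, 2L, 2L, 3L, 3L, 3L),
  edema = c(0, 0, 0, 0.5, 0, 0.5)
)

# ave() applies Mode within each id and recycles the result
# to the original length, like mutate() after group_by()
df$edema2 <- ave(df$edema, df$id, FUN = Mode)

# tapply() returns one mode per id, like summarise() after group_by()
per_id <- tapply(df$edema, df$id, Mode)

print(df)
print(per_id)
```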
