I am trying to run a Kruskal-Wallis test on multiple columns of a sample data frame (df) in R, but I get the following error:
Error in model.frame.default(formula = as.numeric(x) ~ as.factor(Groups), :
variable lengths differ (found for 'as.factor(Groups)')
Here is my sample data frame (df):
Groups Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Gene7 Gene8 Gene9 Gene10
Group1 120.67 69.33 1.24 2.31 0.39 6.57 2.49 383.84 415.23 NA
Group1 157 110.67 0.4 0.84 0.28 2.62 2.11 245.42 325.23 NA
Group1 113.5 66.75 1.07 4.53 0.33 2.37 2.35 421.25 352.03 73.51
Group1 131 79.67 1.13 5.03 0.72 3.36 2.24 305.32 432.81 71.11
Group1 120 79.67 0.91 3.84 0.74 3.77 1.92 298.91 382.43 66.49
Group2 125.67 83.67 2.07 1.73 0.38 3.89 2.09 233.81 377.21 72.1
Group2 103.33 68.67 1.01 4.89 0.3 4.5 1.75 231.5 381.73 53
Group2 121.33 74.67 0.54 2.39 3.95 3.7 2.46 310.66 355.97 143.61
Group2 136 83.67 1.6 1.75 0.32 5.17 2.36 410.21 389.62 170.34
Group2 143.67 71.33 0.56 1.22 0.26 4.48 2.62 294.01 491.57 96.72
Group2 134.67 69.67 0.85 1.77 0.45 3.58 2.44 236.61 441.32 69.06
Group2 158.33 98.33 0.87 3.69 0.51 2.53 2.6 257.66 396.96 41.94
Group2 147.33 88.33 NA NA NA NA NA NA NA NA
Group2 95.67 59 1.39 0.56 0.31 2.49 2.09 395.38 420.28 64.83
Group3 135 82 13.31 24.05 1.21 3.83 2.83 313.71 327.84 66.8
Group3 124.67 78 1.12 2 0.71 3.77 2.42 334.36 358.9 131.35
Group3 152 98.33 1.11 1.54 0.35 2.11 2.21 297.68 433.48 117.18
Group3 135.33 73.67 0.13 2.99 0.3 2.4 1.86 296.82 415.13 112.97
Group3 135.33 87 0.91 3.73 0.65 2.92 1.85 335.31 412.16 103.18
Group4 124.67 77.67 0.28 0.81 0.49 2.62 1.96 251.49 468.19 80.27
Group4 125.67 72.33 1.01 1.82 0.35 3.65 1.62 335.18 264.74 145.15
Group4 169 105 0.6 3.12 0.29 3.9 2.22 311.01 459.85 82.89
Group4 123.67 76.33 0.65 1.78 0.47 2.77 1.57 253.56 283.38 59.07
Group5 132.67 76.33 2.94 17.01 0.27 3.99 2.55 354.78 493.02 145.36
Group5 NA NA 1.34 1.42 0.4 4.21 2.02 243.26 345.2 43.91
Group5 144.33 75 NA NA 0.55 3.26 2.85 312.16 419.86 55.71
Group5 136.25 78.25 NA 1.32 0.65 3.63 1.52 267.13 256.18 53.49
Group5 123.67 69.33 1.81 1.52 0.67 3.89 2 303.89 346.57 112.16
Group5 116.67 66.33 0.7 1.68 0.27 3.55 2.16 284.96 407.04 102.97
Group5 136.67 76 2.68 4.3 0.33 7.36 2.26 237.28 423.29 88.65
Group6 122 63.33 0.87 4.2 0.17 3.92 2.11 159.04 300.24 60.13
Group6 130.67 82.67 0.8 1.85 1 5.26 2.46 388.61 558.51 66.76
Group6 136.33 70.33 0.54 2.26 0.35 NA NA 388.81 551.69 113.39
Group6 127.33 73 1.32 2.19 0.99 4.42 2.59 378.57 501.12 85.56
Group7 186.67 89.67 0.79 1.77 0.53 5.22 2.73 269.87 490.25 77.74
Group7 203 93 5.63 22.08 0.82 6.97 2.92 341.87 611.33 92.7
Group7 127 72.67 0.55 1.07 0.38 3.2 1.69 310.9 410.19 65.62
Group7 142 79.67 1.61 1.35 3.24 3.73 2.08 304.52 495.79 60.15
Here is my code:
kw.tests <- lapply(
data[, -1],
function(x) { kruskal.test(as.numeric(x) ~ as.factor(Groups), data = data_test, na.action=na.omit) }
)
Error in model.frame.default(formula = as.numeric(x) ~ as.factor(Groups), :
variable lengths differ (found for 'as.factor(Groups)')
The same test runs fine when I run it on each gene individually, for example for Gene1:
kruskal.test(Gene1 ~ as.factor(Groups), data = data_test, na.action=na.omit)
Kruskal-Wallis rank sum test
data: Gene1 by as.factor(Groups)
Kruskal-Wallis chi-squared = 5.6607, df = 6, p-value = 0.4622
However, when I use lapply, or even a for loop, I get this error. I have googled the error many times, but none of the answers I found helped.
Here is a snippet of my data:
> dput(data_test)
structure(list(Groups = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L), .Label = c("Group1",
"Group2", "Group3", "Group4", "Group5", "Group6", "Group7"), class = "factor"),
Gene1 = c(120.67, 157, 113.5, 131, 120, 125.67, 103.33, 121.33,
136, 143.67, 134.67, 158.33, 147.33, 95.67, 135, 124.67,
152, 135.33, 135.33, 124.67, 125.67, 169, 123.67, 132.67,
NA, 144.33, 136.25, 123.67, 116.67, 136.67, 122, 130.67,
136.33, 127.33, 186.67, 203, 127, 142), Gene2 = c(69.33,
110.67, 66.75, 79.67, 79.67, 83.67, 68.67, 74.67, 83.67,
71.33, 69.67, 98.33, 88.33, 59, 82, 78, 98.33, 73.67, 87,
77.67, 72.33, 105, 76.33, 76.33, NA, 75, 78.25, 69.33, 66.33,
76, 63.33, 82.67, 70.33, 73, 89.67, 93, 72.67, 79.67), Gene3 = c(1.24,
0.4, 1.07, 1.13, 0.91, 2.07, 1.01, 0.54, 1.6, 0.56, 0.85,
0.87, NA, 1.39, 13.31, 1.12, 1.11, 0.13, 0.91, 0.28, 1.01,
0.6, 0.65, 2.94, 1.34, NA, NA, 1.81, 0.7, 2.68, 0.87, 0.8,
0.54, 1.32, 0.79, 5.63, 0.55, 1.61), Gene4 = c(2.31, 0.84,
4.53, 5.03, 3.84, 1.73, 4.89, 2.39, 1.75, 1.22, 1.77, 3.69,
NA, 0.56, 24.05, 2, 1.54, 2.99, 3.73, 0.81, 1.82, 3.12, 1.78,
17.01, 1.42, NA, 1.32, 1.52, 1.68, 4.3, 4.2, 1.85, 2.26,
2.19, 1.77, 22.08, 1.07, 1.35), Gene5 = c(0.39, 0.28, 0.33,
0.72, 0.74, 0.38, 0.3, 3.95, 0.32, 0.26, 0.45, 0.51, NA,
0.31, 1.21, 0.71, 0.35, 0.3, 0.65, 0.49, 0.35, 0.29, 0.47,
0.27, 0.4, 0.55, 0.65, 0.67, 0.27, 0.33, 0.17, 1, 0.35, 0.99,
0.53, 0.82, 0.38, 3.24), Gene6 = c(6.57, 2.62, 2.37, 3.36,
3.77, 3.89, 4.5, 3.7, 5.17, 4.48, 3.58, 2.53, NA, 2.49, 3.83,
3.77, 2.11, 2.4, 2.92, 2.62, 3.65, 3.9, 2.77, 3.99, 4.21,
3.26, 3.63, 3.89, 3.55, 7.36, 3.92, 5.26, NA, 4.42, 5.22,
6.97, 3.2, 3.73), Gene7 = c(2.49, 2.11, 2.35, 2.24, 1.92,
2.09, 1.75, 2.46, 2.36, 2.62, 2.44, 2.6, NA, 2.09, 2.83,
2.42, 2.21, 1.86, 1.85, 1.96, 1.62, 2.22, 1.57, 2.55, 2.02,
2.85, 1.52, 2, 2.16, 2.26, 2.11, 2.46, NA, 2.59, 2.73, 2.92,
1.69, 2.08), Gene8 = c(383.84, 245.42, 421.25, 305.32, 298.91,
233.81, 231.5, 310.66, 410.21, 294.01, 236.61, 257.66, NA,
395.38, 313.71, 334.36, 297.68, 296.82, 335.31, 251.49, 335.18,
311.01, 253.56, 354.78, 243.26, 312.16, 267.13, 303.89, 284.96,
237.28, 159.04, 388.61, 388.81, 378.57, 269.87, 341.87, 310.9,
304.52), Gene9 = c(415.23, 325.23, 352.03, 432.81, 382.43,
377.21, 381.73, 355.97, 389.62, 491.57, 441.32, 396.96, NA,
420.28, 327.84, 358.9, 433.48, 415.13, 412.16, 468.19, 264.74,
459.85, 283.38, 493.02, 345.2, 419.86, 256.18, 346.57, 407.04,
423.29, 300.24, 558.51, 551.69, 501.12, 490.25, 611.33, 410.19,
495.79), Gene10 = c(NA, NA, 73.51, 71.11, 66.49, 72.1, 53,
143.61, 170.34, 96.72, 69.06, 41.94, NA, 64.83, 66.8, 131.35,
117.18, 112.97, 103.18, 80.27, 145.15, 82.89, 59.07, 145.36,
43.91, 55.71, 53.49, 112.16, 102.97, 88.65, 60.13, 66.76,
113.39, 85.56, 77.74, 92.7, 65.62, 60.15)), class = "data.frame", row.names = c(NA,
-38L))
Any further help is appreciated. Thank you.
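You are using the wrong dataset name in your lapply/apply call: you iterate over the columns of data[, -1], but the formula takes Groups from data_test. If those two objects do not have the same number of rows, x and Groups end up with different lengths, which is what the error reports. Iterating over data_test itself: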
apply(data_test[,-1],2,function(x){kruskal.test(as.numeric(x)~as.factor(data_test$Groups))})
This works for me.
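The same fix can be kept in the lapply form from the question. Below is a minimal sketch, assuming the gene columns and Groups live in the same data frame; the small data_test built here is made up for illustration and is not the data from the question, and p.values is a hypothetical helper name.

```r
# Small made-up data frame standing in for data_test (illustrative values only)
set.seed(1)
data_test <- data.frame(
  Groups = factor(rep(c("Group1", "Group2", "Group3"), each = 5)),
  Gene1  = rnorm(15, mean = 130, sd = 15),
  Gene2  = rnorm(15, mean = 80, sd = 10)
)
data_test$Gene2[3] <- NA  # a missing value, as in the real data

# Iterate over the gene columns of the SAME data frame that supplies Groups,
# so every column x has exactly as many elements as data_test$Groups
kw.tests <- lapply(
  data_test[, -1],
  function(x) kruskal.test(x ~ data_test$Groups)
)

# Collect the p-values into a numeric vector named after the gene columns
p.values <- sapply(kw.tests, function(res) res$p.value)
print(p.values)
```

Because lapply returns a list named after the columns, kw.tests$Gene1 gives the full test result for a single gene. Missing values are dropped per gene by kruskal.test, so na.action = na.omit is not needed as long as the default na.action option is unchanged.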