I have a data frame with the following structure:
str(Ehen)
'data.frame': 412 obs. of 5 variables:
$ DATE : Date, format: "2012-09-11" "2012-09-19" ...
$ Population: Factor w/ 9 levels "Brathay","Clun",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Fish : Factor w/ 3 levels "C","S","T": 2 2 2 2 2 2 2 2 2 2 ...
$ Length : int NA 70 70 80 70 60 70 60 60 70 ...
$ Width : int NA 60 50 70 60 50 60 50 50 60 ...
What I want to test is whether Length is normally distributed within each population, grouping the data by date and fish.
I tried:
aggregate(Ehen$Length ~ Ehen$Fish + Ehen$DATE, FUN =shapiro.test)
Ehen$Fish Ehen$DATE Ehen$Length
1 C 2012-09-19 0.7975819
2 S 2012-09-19 0.8164554
3 S 2012-09-25 0.7935195
4 S 2012-10-04 0.9006435
5 C 2012-10-09 0.8411583
6 S 2012-10-09 0.913051
7 S 2012-10-11 0.8525953
8 C 2012-10-18 0.9084524
9 S 2012-10-18 0.9415459
10 C 2012-10-24 0.9592422
11 S 2012-10-24 0.9774688
12 C 2012-11-02 0.9536037
13 S 2012-11-02 0.9607917
14 C 2012-11-12 0.9570341
15 S 2012-11-12 0.9728865
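This is more or less what I want, but how can I get the p-value of the Shapiro test instead of the W statistic?
I can run the test one date at a time: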
shapiro.test(Ehen$Length[Ehen$DATE=="2012-10-24"])
data: Ehen$Length[Ehen$DATE == "2012-10-24"]
W = 0.9761, p-value = 0.2868
But that is not enough, so I tried:
lapply(split(Ehen$Length, Ehen$Fish, drop = TRUE),shapiro.test)
$C
Shapiro-Wilk normality test
data: X[[1L]]
W = 0.9219, p-value = 1.548e-07
$S
Shapiro-Wilk normality test
data: X[[2L]]
W = 0.9201, p-value = 2.056e-10
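However, I don't know how to include DATE as a second variable in the test so that the data are subset by both Fish and DATE.
I may have been going about this all wrong, or I may already be close to the answer! Thanks in advance.
You could try: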
res <- aggregate(cbind(P.value=Length) ~ Fish + DATE, Ehen,
FUN = function(x) shapiro.test(x)$p.value)
head(res,3)
# Fish DATE P.value
#1 C 2012-09-19 0.25510132 #####
#2 S 2012-09-19 0.11941675
#3 C 2012-09-20 0.04459457
shapiro.test(Ehen$Length[Ehen$DATE=='2012-09-19' & Ehen$Fish=='C'])
# Shapiro-Wilk normality test
#data: Ehen$Length[Ehen$DATE == "2012-09-19" & Ehen$Fish == "C"]
# W = 0.9414, p-value = 0.2551 ######
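If both the W statistic and the p-value are wanted in a single table, one possible variation is to have FUN return a named vector of length two. This is only a sketch and not part of the original answer: the object names `res2` and `flat` are illustrative, and the data are simulated in the same way as the answer's example data. It also assumes every Fish/DATE group has at least 3 non-missing, non-identical lengths, since shapiro.test() stops with an error otherwise.

```r
# Simulated data with the same shape as the answer's example (an assumption,
# not the asker's real data).
set.seed(25)
Ehen <- data.frame(
  DATE   = sample(seq(as.Date("2012-09-19"), length.out = 10, by = "1 day"),
                  412, replace = TRUE),
  Fish   = sample(c("C", "S"), 412, replace = TRUE),
  Length = sample(c(NA, 60:80), 412, replace = TRUE)
)

# FUN returns a named vector, so aggregate() stores the result as a
# two-column matrix inside the Length column.
res2 <- aggregate(Length ~ Fish + DATE, data = Ehen,
                  FUN = function(x) {
                    tst <- shapiro.test(x)
                    c(W = unname(tst$statistic), p.value = tst$p.value)
                  })

# Flatten the matrix column into ordinary columns (Length.W, Length.p.value).
flat <- do.call(data.frame, res2)
head(flat, 3)
```

The formula interface drops rows with a missing Length before grouping (the default na.action), so the NA values do not need to be removed by hand.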
# Data used in the example above
set.seed(25)
Ehen <- data.frame(
    DATE = sample(seq(as.Date('2012-09-19'), length.out = 10, by = '1 day'),
                  412, replace = TRUE),
    Fish = sample(c("C", "S"), 412, replace = TRUE),
    Length = sample(c(NA, 60:80), 412, replace = TRUE))
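The lapply/split attempt from the question can probably be extended to two grouping variables by passing a list of factors to split(). The sketch below is an alternative to the aggregate() call, not the answer's own method. `safe_shapiro_p` is a hypothetical helper introduced here for illustration: it returns NA instead of stopping when a group is outside the range shapiro.test() accepts (3 to 5000 non-missing values) or when all values are identical. The data are simulated as in the answer's example.

```r
# Simulated data with the same shape as the answer's example (an assumption,
# not the asker's real data).
set.seed(25)
Ehen <- data.frame(
  DATE   = sample(seq(as.Date("2012-09-19"), length.out = 10, by = "1 day"),
                  412, replace = TRUE),
  Fish   = sample(c("C", "S"), 412, replace = TRUE),
  Length = sample(c(NA, 60:80), 412, replace = TRUE)
)

# Hypothetical helper: p-value of the Shapiro-Wilk test, or NA when the
# group cannot be tested.
safe_shapiro_p <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) < 3 || length(x) > 5000 || length(unique(x)) < 2) {
    return(NA_real_)
  }
  shapiro.test(x)$p.value
}

# Split Length by every observed Fish/DATE combination; drop = TRUE discards
# combinations that never occur.
groups <- split(Ehen$Length, list(Ehen$Fish, Ehen$DATE), drop = TRUE)

# Named numeric vector of p-values, one per group (names look like
# "C.2012-09-19").
p_values <- sapply(groups, safe_shapiro_p)
head(p_values, 3)
```

Compared with aggregate(), this keeps the result as a named vector rather than a data frame, which may be convenient for a quick look but less so for further table-based processing.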