我正在尝试计算我在回归的截距和斜率上生成的一组残留自举副本的覆盖率。谁能告诉我如何计算置信区间的覆盖率?非常感谢。
请注意,我使用Qr分解手动运行了回归,但lm()
如果更简单,则可以使用。我只是认为手动执行会更快。
set.seed(42) ## for sake of reproducibility
n <- 100
x <- rnorm(n)
e <- rnorm(n)
y <- as.numeric(50 + 25*x + e)
dd <- data.frame(id=1:n, x=x, y=y)
mo <- lm(y ~ x, data=dd)
# Manual Residual Bootstrap
resi <- residuals(mo)
fit <- fitted(mo)
ressampy <- function() fit + sample(resi, length(resi), replace=TRUE)
# Sample y values:
head(ressampy())
# Qr decomposition of X values
qrX <- qr(cbind(Intercept=1, dd[, "x", drop=FALSE]), LAPACK=TRUE)
# faster than LM
qr.coef(qrX, dd[, "y"])
# One Bootstrap replication
boot1 <- qr.coef(qrX, ressampy())
# 1000 bootstrap replications
boot <- t(replicate(1000, qr.coef(qrX, ressampy())))
您可以首先使用confint
“ HC2”健壮的vcov计算CI ,
mo <- lm(y ~ x, data=dd)
# ci <- confint(mo)
## or
library(lmtest);library(sandwich)
ci <- confint(coeftest(mo, vcov.=vcovHC(mo, type="HC3"))) ## probably better
然后假设正态分布,则quantile
在升压时计算.025和.975 s FUN
。matrixStats::colQuantiles
会很好并且快速地做到这一点。
FUN <- function(qrX) {
b <- t(replicate(200, qr.coef(qrX, ressampy())))
matrixStats::colQuantiles(b, probs=c(.025, .975))
}
set.seed(42)
R <- 200
res <- replicate(R, FUN(qrX), simplify=F)
## where one iteration looks like this
res[[1]]
# 2.5% 97.5%
# Intercept 49.77095 50.09592
# x 24.83468 25.20457
## grab results for intercept and x
icc <- t(sapply(res, function(x, y) x[grep(y, rownames(x)), ], "Intercept"))
cfc <- t(sapply(res, function(x, y) x[grep(y, rownames(x)), ], "x"))
最后,data.table::between
为了方便起见,请检查引导程序的配置项相对于引导程序R
重复数位于原始配置项之间的次数,以获得覆盖概率。
## intercept
sum(apply(icc, 1, function(x) {
all(data.table::between(x, ci[1,1], ci[1,2]))
}))/R
# [1] 0.536
## x
sum(apply(cfc, 1, function(x) {
all(data.table::between(x, ci[2,1], ci[2,2]))
}))/R
# [1] 0.796
要仅使用基数R,您可以代替 matrixStats::colQuantiles
matrixStats::colQuantiles(b, probs=c(.025, .975))
# 2.5% 97.5%
# Intercept 49.24900 50.12374
# x 25.02903 25.68369
t(apply(b, 2, quantile, probs=c(.025, .975)))
# 2.5% 97.5%
# Intercept 49.24900 50.12374
# x 25.02903 25.68369
而不是data.table::between
:
data.table::between(1:5, 2, 4)
# [1] FALSE TRUE TRUE TRUE FALSE
1:5 >= 2 & 1:5 <= 4
# [1] FALSE TRUE TRUE TRUE FALSE
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句