mgcv: standard errors differ in predict.bam() with discrete = TRUE

Posted by debugcn in Dev

Derek Powell

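I fit a large multilevel model using bam() with discrete = TRUE to speed up computation. I then want to make predictions from that model while including or ignoring certain random-effect terms, which I understand can be done with the terms argument of predict.bam(). However, when I change the discrete option of predict.bam() I get inconsistent results, and I am not sure which are correct.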

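When all smooth terms are used, everything looks fine. But when only some smooth terms are selected with discrete = TRUE, the predicted values are the same as for the model fit with gam(), but the standard errors differ. This sometimes inflates and sometimes reduces the standard errors. Using discrete = FALSE gives results that match the model fit with gam(). So is there a bug somewhere in predict.bam()? Which approach computes things correctly?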

Here is a reproducible example with output:

library(lme4)
library(mgcv)

data(sleepstudy)

model <- gam(Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"),
                data = sleepstudy,
                method = "fREML"
)

model_d <- bam(Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"),
                 data = sleepstudy,
                 method = "fREML"
                 ,discrete=TRUE
)

## including all smooth terms is fine
head(
    data.frame(
        gam1 = predict(model),
        gam2 = predict(model_d, discrete=TRUE),
        gam3 = predict(model_d, discrete=FALSE)
    )
)
# gam1     gam2     gam3
# 1 252.9178 252.9178 252.9178
# 2 272.7086 272.7086 272.7086
# 3 292.4994 292.4994 292.4994
# 4 312.2901 312.2901 312.2901
# 5 332.0809 332.0809 332.0809
# 6 351.8717 351.8717 351.8717

head(
    data.frame(
        gam1 = predict(model,se.fit=TRUE)$se.fit,
        gam2 = predict(model_d, discrete=TRUE, se.fit=TRUE)$se.fit,
        gam3 = predict(model_d, discrete=FALSE, se.fit=TRUE)$se.fit
    )
)
# gam1      gam2      gam3
# 1 12.410215 12.410215 12.410215
# 2 10.660886 10.660886 10.660886
# 3  9.191220  9.191220  9.191220
# 4  8.153867  8.153867  8.153867
# 5  7.724996  7.724996  7.724996
# 6  8.003034  8.003034  8.003034

## ---- selecting only some smooth terms
## with discrete = TRUE, predicted values are the same but 
## standard errors returned are the same as those with all smooths included.
## This sometimes inflates them and sometimes reduces them.

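## NOTE: gam2 and gam3 are identical calls here (no `discrete` argument),
## so both use the predict.bam() default, which is discrete = TRUE
head(
    data.frame(
        gam1 = predict(model, terms=c("s(Subject)")),
        gam2 = predict(model_d, terms=c("s(Subject)")),
        gam3 = predict(model_d, terms=c("s(Subject)"))
    )
)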

# gam1     gam2     gam3
# 1 252.9178 252.9178 252.9178
# 2 263.3851 272.7086 272.7086
# 3 273.8524 292.4994 292.4994
# 4 284.3197 312.2901 312.2901
# 5 294.7869 332.0809 332.0809
# 6 305.2542 351.8717 351.8717

head(
    data.frame(
        gam1 = predict(model, terms=c("s(Subject)"),se.fit=TRUE)$se.fit,
        gam2 = predict(model_d, terms=c("s(Subject)"), discrete=TRUE, se.fit=TRUE)$se.fit,
        gam3 = predict(model_d, terms=c("s(Subject)"), discrete=FALSE, se.fit=TRUE)$se.fit
    )
)

# gam1      gam2     gam3
# 1 12.41021 12.410215 12.41021
# 2 12.34846 10.660886 12.34846
# 3 12.48280  9.191220 12.48280
# 4 12.80704  8.153867 12.80704
# 5 13.30733  7.724996 13.30733
# 6 13.96474  8.003034 13.96474

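## NOTE: mgcv labels this smooth "s(Days,Subject)" (no space after the comma),
## as printed by summary(model); check that the label passed to `terms` matches
head(
    data.frame(
        gam1 = predict(model, terms=c("s(Days, Subject)"),se.fit=TRUE)$se.fit,
        gam2 = predict(model_d, terms=c("s(Days, Subject)"), discrete=TRUE, se.fit=TRUE)$se.fit,
        gam3 = predict(model_d, terms=c("s(Days, Subject)"), discrete=FALSE, se.fit=TRUE)$se.fit
    )
)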

# gam1      gam2     gam3
# 1 6.885381 12.410215 6.885381
# 2 6.773449 10.660886 6.773449
# 3 7.015357  9.191220 7.015357
# 4 7.577292  8.153867 7.577292
# 5 8.395234  7.724996 8.395234
# 6 9.402609  8.003034 9.402609
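
The side-by-side tables above can also be checked programmatically. The following is a minimal sketch, assuming the same sleepstudy model as above; `max_se_gap` is a hypothetical helper name and is not part of mgcv. It reports the largest absolute difference between the standard errors returned with discrete = TRUE and discrete = FALSE for a given `terms` value. The term labels used are the ones mgcv assigns to the smooths.

```r
library(lme4)   # only needed for the sleepstudy data
library(mgcv)

data(sleepstudy)

model_d <- bam(Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"),
               data = sleepstudy, method = "fREML", discrete = TRUE)

# Hypothetical helper: largest absolute gap between the standard errors
# from the discrete and the non-discrete prediction paths.
max_se_gap <- function(fit, terms = NULL) {
  se_disc  <- predict(fit, terms = terms, discrete = TRUE,  se.fit = TRUE)$se.fit
  se_plain <- predict(fit, terms = terms, discrete = FALSE, se.fit = TRUE)$se.fit
  max(abs(as.numeric(se_disc) - as.numeric(se_plain)))
}

gaps <- c(
  all_terms    = max_se_gap(model_d),
  subject_only = max_se_gap(model_d, terms = "s(Subject)"),
  slope_only   = max_se_gap(model_d, terms = "s(Days,Subject)")
)
print(gaps)
```

A gap near zero for `all_terms` together with a clearly nonzero gap for the other two entries would reproduce the discrepancy described here; the actual values depend on the installed mgcv version.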

Update

I did some more digging and may have answered my own question, but I would still appreciate input from anyone with more expertise.

I tried computing things manually using predict(..., type = "lpmatrix"), following this helpful blog post by Gavin Simpson.

It appears that the discrete = TRUE output is simply wrong and that this is a bug of some kind.

This code continues from the code above:


### ---- manual computation with simulation via lpmatrix
mvrnorm <- MASS::mvrnorm

lp <- predict(model_d, type = "lpmatrix")

coefs <- coef(model_d)
vc <- vcov(model_d)

set.seed(123)
sim <- mvrnorm(5e4, mu = coefs, Sigma = vc)

fits <- lp %*% t(sim)

se.fit <- apply(fits, 1, sd)

## with all effects
head(
    data.frame(
        gam1 = predict(model,se.fit=TRUE)$se.fit,
        gam2 = predict(model_d, discrete=TRUE, se.fit=TRUE)$se.fit,
        gam3 = predict(model_d, discrete=FALSE, se.fit=TRUE)$se.fit,
        man = se.fit
    )
)
# gam1      gam2      gam3       man
# 1 12.410220 12.410215 12.410215 12.453005
# 2 10.660891 10.660886 10.660886 10.704449
# 3  9.191224  9.191220  9.191220  9.235621
# 4  8.153871  8.153867  8.153867  8.198276
# 5  7.724998  7.724996  7.724996  7.767261
# 6  8.003034  8.003034  8.003034  8.040678


## ---- with only s(Subject) random effects
want <- c(c(1,2), grep("s\\(Subject\\)", colnames(lp))) # regex is obnoxious here
fits <- lp[, want] %*% t(sim[, want])

se.fit <- apply(fits, 1, sd)

head(
    data.frame(
        gam1 = predict(model, terms=c("s(Subject)"),se.fit=TRUE)$se.fit,
        gam2 = predict(model_d, terms=c("s(Subject)"), discrete=TRUE, se.fit=TRUE)$se.fit,
        gam3 = predict(model_d, terms=c("s(Subject)"), discrete=FALSE, se.fit=TRUE)$se.fit,
        man = se.fit
    )
)

# gam1      gam2     gam3      man
# 1 12.41022 12.410215 12.41021 12.45300
# 2 12.34847 10.660886 12.34846 12.39594
# 3 12.48280  9.191220 12.48280 12.53395
# 4 12.80704  8.153867 12.80704 12.86074
# 5 13.30733  7.724996 13.30733 13.36248
# 6 13.96474  8.003034 13.96474 14.02039
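
The simulation above approximates the standard errors, which is why the `man` column is close to but not equal to the others. For a linear predictor X %*% beta with coefficient covariance V, the standard error of row i is sqrt(x_i' V x_i), so the same quantity can be computed exactly without sampling. The following is a minimal sketch under that assumption; `se_from_lp` is a hypothetical helper name, and the lpmatrix is requested with discrete = FALSE so that it comes from the non-discrete code path.

```r
library(lme4)   # only needed for the sleepstudy data
library(mgcv)

data(sleepstudy)

model_d <- bam(Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"),
               data = sleepstudy, method = "fREML", discrete = TRUE)

lp <- predict(model_d, type = "lpmatrix", discrete = FALSE)
vc <- vcov(model_d)

# Hypothetical helper: exact standard error of the linear predictor using
# only the coefficient columns in `cols` (all columns by default).
se_from_lp <- function(X, V, cols = seq_len(ncol(X))) {
  Xs <- X[, cols, drop = FALSE]
  Vs <- V[cols, cols, drop = FALSE]
  sqrt(pmax(0, rowSums((Xs %*% Vs) * Xs)))
}

# All terms: intercept, Days and both random-effect smooths
se_all <- se_from_lp(lp, vc)

# Same column selection as the simulation: intercept, Days and s(Subject);
# fixed = TRUE avoids having to escape the parentheses
want <- c(1, 2, grep("s(Subject)", colnames(lp), fixed = TRUE))
se_subject <- se_from_lp(lp, vc, want)

head(data.frame(se_all = se_all, se_subject = se_subject))
```

With all columns included, `se_all` should agree with predict(model_d, discrete = FALSE, se.fit = TRUE)$se.fit up to floating-point error, which gives a reference that does not depend on Monte Carlo noise.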

Simon Wood

This was a bug in the discrete prediction code. It is fixed for mgcv_1.8-32. Thanks! Simon
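
Since the fix is tied to a specific release, it may help to check the installed version before relying on discrete prediction with `terms`. The following is a minimal sketch; the variable names are illustrative, and the fallback to discrete = FALSE is based only on the question's finding that this path agrees with gam().

```r
# Version named in the answer above as containing the fix
fixed_in <- package_version("1.8-32")

installed <- packageVersion("mgcv")
has_fix <- installed >= fixed_in

# On older versions, fall back to the non-discrete prediction path
use_discrete <- has_fix

cat("mgcv", as.character(installed),
    "- use discrete prediction with `terms`:", use_discrete, "\n")
```

The flag can then be passed on, for example predict(model_d, terms = "s(Subject)", discrete = use_discrete, se.fit = TRUE).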

Edited on 2021-04-2