Computing with R-INLA: Accuracy and reproducibility with implications for the analysis of COVID-19 data

The statistical methods used to analyze medical data are becoming increasingly complex. Novel statistical methods increasingly rely on simulation studies to assess their validity. Such assessments typically appear in statistical or computational journals, and the methodology is later introduced to the medical community through tutorials. This can be problematic if applied researchers use the methodologies in settings that have not been evaluated. In this paper, we explore a case study of one such method that has become popular in the analysis of coronavirus disease 2019 (COVID-19) data. The integrated nested Laplace approximation (INLA) method, as implemented in the R-INLA package, approximates the marginal posterior distributions of target parameters that would have been obtained from a fully Bayesian analysis. We seek to answer an important question: Does existing research on the accuracy of INLA's approximations support how researchers are currently using it to analyze COVID-19 data? We identify three limitations of existing work assessing INLA's accuracy: 1) inconsistent definitions of accuracy, 2) a lack of studies validating how researchers are actually using INLA, and 3) a lack of research into the reproducibility of INLA's output. We explore the practical impact of each limitation with simulation studies based on models and data used in COVID-19 research. Our results suggest that existing methods of assessing the accuracy of the INLA technique may not support how COVID-19 researchers are using it. Guided in part by our results, we propose a set of minimum guidelines for researchers using statistical methodologies primarily validated through simulation studies.
