Multilevel linear models allow flexible statistical modelling of complex data with different levels of stratification. Identifying the most appropriate model from the large set of possible candidates is a challenging problem. In the Bayesian setting, the standard approach is to compare models using the model evidence or the Bayes factor. However, in all but the simplest cases, direct computation of these quantities is impossible. Markov chain Monte Carlo approaches, such as sequential Monte Carlo, are widely used, but it is not always clear how well such techniques perform in practice. We present an improved method for estimating the log model evidence, based on an intermediate analytic computation of a marginal likelihood in which the non-variance parameters are integrated out. This reduces the dimensionality of the problem handled by the Monte Carlo sampling algorithm, which in turn yields more consistent estimates. We illustrate this method on a popular multilevel dataset containing levels of radon in homes in the US state of Minnesota.
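The idea of integrating out the non-variance parameters analytically can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: a varying-intercept model y_i = alpha_{j[i]} + eps_i with alpha_j ~ N(mu, tau^2) and eps_i ~ N(0, sigma^2), a N(0, mu_sd^2) prior on mu, and log-normal priors on sigma and tau. Under those assumptions the intercepts and their mean can be integrated out in closed form, leaving a Gaussian marginal likelihood that depends only on the two standard deviations. For brevity, the remaining two-dimensional integral is estimated here by plain Monte Carlo sampling from the prior rather than by sequential Monte Carlo. The function names are hypothetical.

```python
import numpy as np


def log_marginal_likelihood(y, group, sigma, tau, mu_sd=10.0):
    """log p(y | sigma, tau), with group intercepts and their mean integrated out.

    Assumed model (illustrative, not taken from the abstract):
      y_i = alpha_{j[i]} + eps_i,  alpha_j ~ N(mu, tau^2),
      eps_i ~ N(0, sigma^2),       mu ~ N(0, mu_sd^2).
    Marginally, y ~ N(0, sigma^2 I + tau^2 Z Z^T + mu_sd^2 1 1^T).
    """
    n = y.size
    # n x J indicator matrix mapping observations to groups
    Z = (group[:, None] == np.unique(group)[None, :]).astype(float)
    cov = sigma**2 * np.eye(n) + tau**2 * Z @ Z.T + mu_sd**2 * np.ones((n, n))
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, y)              # whitened residuals
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ z)


def log_evidence_estimate(y, group, n_draws=2000, seed=0):
    """Plain Monte Carlo estimate of the log evidence over (sigma, tau) only."""
    rng = np.random.default_rng(seed)
    # Assumed priors: log-normal(0, 1) on both standard deviations
    sigmas = rng.lognormal(0.0, 1.0, n_draws)
    taus = rng.lognormal(0.0, 1.0, n_draws)
    log_liks = np.array(
        [log_marginal_likelihood(y, group, s, t) for s, t in zip(sigmas, taus)]
    )
    # Log-mean-exp for numerical stability
    m = log_liks.max()
    return m + np.log(np.mean(np.exp(log_liks - m)))
```

In this sketch the sampler only has to explore two dimensions, whatever the number of groups, which is the kind of dimensionality reduction the abstract describes. Averaging over prior draws is a simple stand-in that can have high variance; it is used here only to keep the example short.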