The training of high-dimensional regression models on comparatively sparse data is an important yet complicated topic, especially when there are many more model parameters than observations in the data. From a Bayesian perspective, inference in such cases can be achieved with the help of shrinkage prior distributions, at least for generalized linear models. However, real-world data usually possess multilevel structures, such as repeated measurements or natural groupings of individuals, which existing shrinkage priors are not built to deal with. We generalize and extend one of these priors, the R2-D2 prior by Zhang et al. (2020), to linear multilevel models, leading to what we call the R2-D2-M2 prior. The proposed prior enables both local and global shrinkage of the model parameters. It comes with interpretable hyperparameters, which we show to be intrinsically related to vital properties of the prior, such as rates of concentration around the origin, tail behavior, and the amount of shrinkage the prior exerts. We offer guidelines on how to select the prior's hyperparameters by deriving shrinkage factors and measuring the effective number of non-zero model coefficients. Hence, the user can readily evaluate and interpret the amount of shrinkage implied by a specific choice of hyperparameters. Finally, we perform extensive experiments on simulated and real data, showing that our prior is well calibrated, has desirable global and local regularization properties, and enables the reliable and interpretable estimation of much more complex Bayesian multilevel models than was previously possible.
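To make the local-and-global shrinkage idea concrete, the following is a minimal sketch of drawing coefficients from a simplified R2-D2-style prior: a Beta prior on the explained variance R² sets the global scale, and a Dirichlet decomposition allocates that variance across coefficients locally. The hyperparameter values (`a`, `b`, `api`) and the use of a normal slab instead of a double-exponential are illustrative assumptions, not the paper's recommended construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_r2d2_coefs(p, a=0.5, b=2.0, api=0.5, sigma=1.0, size=1):
    """Draw `size` coefficient vectors of length `p` from a simplified
    R2-D2-type prior (normal slab; hyperparameters are illustrative)."""
    r2 = rng.beta(a, b, size=size)                    # prior on explained variance R^2
    omega = r2 / (1.0 - r2)                           # global scale (signal-to-noise ratio)
    phi = rng.dirichlet(np.full(p, api), size=size)   # local allocation; each row sums to 1
    scale = sigma * np.sqrt(phi * omega[:, None])     # per-coefficient standard deviation
    return rng.normal(0.0, scale)                     # shrunken regression coefficients

betas = sample_r2d2_coefs(p=10, size=1000)
print(betas.shape)  # (1000, 10)
```

Small Dirichlet concentrations (`api` well below 1) push most of the explained variance onto a few coefficients, which is what produces the sparsity-inducing local shrinkage described above.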