Bayesian clustering using random effects models and predictive projections

Linear mixed models are widely used for analyzing hierarchically structured data involving missingness and unbalanced study designs. We consider a Bayesian clustering method that combines linear mixed models and predictive projections. For each observation, we consider a predictive replicate in which only a subset of the random effects is shared between the observation and its replicate, with the remainder being integrated out using the conditional prior. Predictive projections are then defined in which the number of distinct values taken by the shared random effects is finite, in order to obtain different clusters. Integrating out some of the random effects acts as a noise filter, allowing the clustering to be focused on only certain chosen features of the data. The method is inspired by methods for Bayesian model checking, in which simulated data replicates from a fitted model are used for model criticism by examining their similarity to the observed data in relevant ways. Here the predictive replicates are used to define similarity between observations in relevant ways for clustering. To illustrate the way our method reveals aspects of the data at different scales, we consider fitting temporal trends in longitudinal data using Fourier cosine bases with a random effect for each basis function, and different clusterings defined by shared random effects for replicates of low or high frequency terms. The method is demonstrated in a series of real examples.
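
The idea of clustering at different scales can be illustrated with a small sketch. It is not the predictive-projection procedure described above: it only shows how restricting attention to low- or high-frequency cosine coefficients changes which series look alike. The function names, the midpoint time grid, and the single `shrink` parameter are assumptions made for illustration. `shrink` is a ridge-type stand-in for a Gaussian random-effects prior with a common variance, and would correspond to the ratio of noise variance to random-effect variance.

```python
import math


def cosine_basis(n_points, n_terms):
    # Row i holds cos(pi * k * t_i) for k = 0..n_terms-1,
    # evaluated on the midpoint grid t_i = (i + 0.5) / n_points.
    return [
        [math.cos(math.pi * k * (i + 0.5) / n_points) for k in range(n_terms)]
        for i in range(n_points)
    ]


def shrunken_coefficients(y, n_terms, shrink=0.0):
    # On the midpoint grid the cosine columns are mutually orthogonal,
    # so the ridge estimate reduces to scaled inner products.
    n = len(y)
    if n_terms > n:
        raise ValueError("n_terms must not exceed the number of time points")
    basis = cosine_basis(n, n_terms)
    coefs = []
    for k in range(n_terms):
        col_ss = n if k == 0 else n / 2.0  # sum of squares of column k
        inner = sum(basis[i][k] * y[i] for i in range(n))
        coefs.append(inner / (col_ss + shrink))
    return coefs


def subspace_distance(c1, c2, terms):
    # Euclidean distance using only the chosen ("shared") basis terms;
    # all other coefficients are ignored.
    return math.sqrt(sum((c1[k] - c2[k]) ** 2 for k in terms))


# Three toy series: a and b share a slow trend, a and c share a fast wiggle.
n = 32
t = [(i + 0.5) / n for i in range(n)]
a = [math.cos(math.pi * 1 * s) + 0.5 * math.cos(math.pi * 6 * s) for s in t]
b = [math.cos(math.pi * 1 * s) + 0.5 * math.cos(math.pi * 7 * s) for s in t]
c = [-math.cos(math.pi * 1 * s) + 0.5 * math.cos(math.pi * 6 * s) for s in t]

ca, cb, cc = (shrunken_coefficients(y, 8, shrink=1.0) for y in (a, b, c))

low_terms = [0, 1, 2]   # hypothetical choice of "low frequency" terms
high_terms = [5, 6, 7]  # hypothetical choice of "high frequency" terms

d_low_ab = subspace_distance(ca, cb, low_terms)
d_low_ac = subspace_distance(ca, cc, low_terms)
d_high_ab = subspace_distance(ca, cb, high_terms)
d_high_ac = subspace_distance(ca, cc, high_terms)
```

In this toy setting, series `a` is closest to `b` when only the low-frequency terms are compared and closest to `c` when only the high-frequency terms are compared. The choice of which coefficients to share therefore determines the grouping. A full treatment would replace the distance comparison with the posterior and predictive calculations of the fitted mixed model.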
