Linear mixed models are widely used for analyzing hierarchically structured data involving missingness and unbalanced study designs. We consider a Bayesian clustering method that combines linear mixed models and predictive projections. For each observation, we consider a predictive replicate in which only a subset of the random effects is shared between the observation and its replicate, with the remainder being integrated out using the conditional prior. Predictive projections are then defined in which the number of distinct values taken by the shared random effects is finite, in order to obtain different clusters. Integrating out some of the random effects acts as a noise filter, allowing the clustering to be focused on only certain chosen features of the data. The method is inspired by methods for Bayesian model checking, in which simulated data replicates from a fitted model are used for model criticism by examining their similarity to the observed data in relevant ways. Here the predictive replicates are used to define similarity between observations in relevant ways for clustering. To illustrate the way our method reveals aspects of the data at different scales, we consider fitting temporal trends in longitudinal data using Fourier cosine bases with a random effect for each basis function, and different clusterings defined by shared random effects for replicates of low or high frequency terms. The method is demonstrated in a series of real examples.
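As a rough illustration of the replicate idea described above, the following Python sketch builds a Fourier cosine basis, draws one random effect per basis function, and forms a replicate that shares only the low-frequency effects. It is a minimal sketch under stated assumptions, not the authors' implementation: the basis form cos(pi k t), the independent Gaussian prior (under which the conditional prior of the unshared effects equals their marginal prior), the prior scales, the split into two "low" and four "high" frequency terms, and the names `cosine_basis` and `replicate_effects` are all hypothetical choices made for illustration. The projection step that restricts the shared effects to finitely many distinct values is not shown.

```python
import numpy as np


def cosine_basis(t, n_basis):
    """Evaluate cos(pi * k * t) for k = 1..n_basis at times t in [0, 1]."""
    k = np.arange(1, n_basis + 1)
    return np.cos(np.pi * np.outer(t, k))


def replicate_effects(b, shared, scales, rng):
    """Draw replicate random effects that keep the entries in `shared`.

    Assumes an independent zero-mean Gaussian prior, so the unshared
    entries are simply redrawn from their marginal prior.
    """
    b_rep = rng.normal(scale=scales)
    b_rep[shared] = b[shared]
    return b_rep


rng = np.random.default_rng(0)
n_basis = 6
t = np.linspace(0.0, 1.0, 50)
Z = cosine_basis(t, n_basis)                 # design matrix, shape (50, 6)

# Hypothetical prior scales that shrink with frequency.
scales = 1.0 / np.arange(1, n_basis + 1)
b = rng.normal(scale=scales)                 # one subject's random effects

low = slice(0, 2)                            # assumed "low frequency" terms
high = slice(2, n_basis)                     # assumed "high frequency" terms

# Observed trajectory: low- plus high-frequency trend plus noise.
y = Z @ b + rng.normal(scale=0.1, size=t.size)

# Replicate sharing only the low-frequency effects with the observation.
b_rep = replicate_effects(b, low, scales, rng)
y_rep = Z @ b_rep + rng.normal(scale=0.1, size=t.size)
```

In this sketch, `y` and `y_rep` share the same slow trend and differ in their fine-scale behaviour, which is the sense in which redrawing the unshared effects filters out features that should not drive the clustering. Sharing `high` instead of `low` would focus on the high-frequency terms.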