We propose Dirichlet Process Mixture (DPM) models for prediction and cluster-wise variable selection, based on two choices of shrinkage baseline prior distributions for the linear regression coefficients, namely the Horseshoe prior and the Normal-Gamma prior. We show in a simulation study that each of the two proposed DPM models tends to outperform the standard DPM model based on the non-shrinkage normal prior in terms of predictive, variable selection, and clustering accuracy. The improvement is most pronounced for the Horseshoe model and when the number of covariates exceeds the within-cluster sample size. A real data set is analyzed to illustrate the proposed modeling methodology, and both proposed DPM models again attain better predictive accuracy.
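To illustrate why a shrinkage baseline prior behaves differently from the non-shrinkage normal prior, the following minimal sketch draws regression coefficients from a Horseshoe prior and from a standard normal prior and compares how much mass each places near zero and in the tails. It is not the authors' implementation: the function names, the global scale `tau = 1`, the unit normal standard deviation, and the thresholds 0.1 and 5 are all assumptions chosen for illustration, and the abstract does not state the hyperparameters used in the paper.

```python
import math
import random


def draw_horseshoe(rng, tau=1.0):
    """Draw one coefficient from a Horseshoe prior (tau is an assumed global scale)."""
    # Local scale: half-Cauchy(0, 1), obtained as |tan(pi * (U - 1/2))| for uniform U.
    lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
    # Conditional on the scales, the coefficient is normal with sd lam * tau.
    return rng.gauss(0.0, 1.0) * lam * tau


def draw_normal(rng, sd=1.0):
    """Draw one coefficient from a non-shrinkage normal prior (sd is assumed)."""
    return rng.gauss(0.0, sd)


def spike_and_tail(draws, near=0.1, far=5.0):
    """Return the fraction of draws with |b| < near and the fraction with |b| > far."""
    n = len(draws)
    spike = sum(abs(b) < near for b in draws) / n
    tail = sum(abs(b) > far for b in draws) / n
    return spike, tail


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_draws = 50_000
horseshoe_draws = [draw_horseshoe(rng) for _ in range(n_draws)]
normal_draws = [draw_normal(rng) for _ in range(n_draws)]

hs_spike, hs_tail = spike_and_tail(horseshoe_draws)
nm_spike, nm_tail = spike_and_tail(normal_draws)

print(f"Horseshoe: P(|b| < 0.1) ~ {hs_spike:.3f}, P(|b| > 5) ~ {hs_tail:.3f}")
print(f"Normal:    P(|b| < 0.1) ~ {nm_spike:.3f}, P(|b| > 5) ~ {nm_tail:.6f}")
```

Under these assumptions the Horseshoe draws concentrate more mass near zero and also in the far tails than the normal draws. This is the property that lets a shrinkage prior pull irrelevant coefficients toward zero while leaving large ones comparatively unshrunk, which matters most when covariates outnumber the observations within a cluster.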