We provide a flexible framework for selecting among a class of additive partial linear models that allows both linear and nonlinear additive components. In practice, it is challenging to determine which additive components should be excluded from the model while simultaneously determining whether each nonzero additive component should enter the final model as a linear or a nonlinear term. In this paper, we propose a Bayesian model selection method that is facilitated by a carefully specified class of models, including the choice of prior distribution and of the nonparametric model used for the nonlinear additive components. We employ a series of latent variables that select the effect of each variable from three possibilities (no effect, linear effect, or nonlinear effect) and that simultaneously determine the knots of each spline, which provides a suitable penalization of the smooth functions. The use of a pseudo-prior distribution along with a collapsing scheme enables us to deploy well-behaved Markov chain Monte Carlo samplers, both for model selection and for fitting the preferred model. Our method and algorithm are evaluated in a suite of numerical studies and are applied to a nutritional epidemiology study. The numerical results show that the proposed methodology outperforms previously available methods in terms of the effective sample sizes of the Markov chain samplers and the overall misclassification rates.
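To make the three-way choice described in the abstract concrete, the following is a minimal sketch of how a per-covariate latent indicator could map to columns of a design matrix. It is an illustration under stated assumptions, not the parameterization used in the paper: the names `gamma`, `design_matrix`, and `truncated_linear_basis` are hypothetical, the truncated-linear basis is a simple stand-in for whatever spline basis the method uses, and the knots are held fixed here, whereas the abstract describes latent variables that also select them.

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Columns (x - k)_+ for each knot k; a stand-in nonlinear basis (assumption)."""
    return np.maximum(x[:, None] - np.asarray(knots)[None, :], 0.0)

def design_matrix(X, gamma, knots):
    """Assemble design columns from hypothetical per-covariate indicators.

    gamma[j] = 0: covariate j is excluded
    gamma[j] = 1: covariate j enters linearly
    gamma[j] = 2: covariate j enters linearly plus spline columns
    """
    n, p = X.shape
    cols = [np.ones((n, 1))]  # intercept
    for j in range(p):
        if gamma[j] >= 1:
            cols.append(X[:, [j]])
        if gamma[j] == 2:
            cols.append(truncated_linear_basis(X[:, j], knots[j]))
    return np.hstack(cols)

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 3))
gamma = [0, 1, 2]                      # excluded, linear, nonlinear
knots = [[], [], [0.25, 0.5, 0.75]]    # fixed knots, used only when gamma[j] == 2
D = design_matrix(X, gamma, knots)
print(D.shape)  # (50, 6): intercept + 1 linear + (1 linear + 3 spline)
```

In a sampler along the lines the abstract describes, indicators like `gamma` would be updated at each iteration rather than fixed, so that the visited configurations reflect posterior support for each of the three effect types.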