Latent Gaussian models and boosting are widely used techniques in statistics and machine learning. Tree-boosting shows excellent predictive accuracy on many data sets, but it has potential drawbacks: it assumes conditional independence of samples, produces discontinuous predictions for, e.g., spatial data, and can have difficulty with high-cardinality categorical variables. Latent Gaussian models, such as Gaussian process and grouped random effects models, are flexible prior models that allow for making probabilistic predictions. However, existing latent Gaussian models usually assume either a zero or a linear prior mean function, which can be an unrealistic assumption. This article introduces a novel approach that combines boosting and latent Gaussian models in order to remedy the above-mentioned drawbacks and to leverage the advantages of both techniques. We obtain increased predictive accuracy compared to existing approaches in both simulated and real-world data experiments.