We introduce a novel way to combine boosting with Gaussian process and mixed effects models. The combination relaxes two assumptions: first, the zero or linear prior mean function assumed in Gaussian process and grouped random effects models, which is replaced by a flexible non-parametric function, and second, the independence assumption made in most boosting algorithms. Relaxing the former is advantageous for prediction accuracy and for avoiding model misspecification. Relaxing the latter is important for efficient learning of the fixed effects predictor function and for obtaining probabilistic predictions. Our proposed algorithm is also a novel solution for handling high-cardinality categorical variables in tree-boosting. In addition, we present an extension that scales to large data by using a Vecchia approximation for the Gaussian process model, relying on novel results for covariance parameter inference. We obtain higher prediction accuracy than existing approaches on several simulated and real-world data sets.
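The two relaxed assumptions can be made concrete with a minimal sketch of the kind of model the abstract describes. The notation is an assumption for illustration and does not appear in the text above: $F$ is the fixed effects predictor function learned by boosting, $Z$ maps observations to random effects $b$ (grouped effects or a Gaussian process evaluated at the observed locations), $\Sigma_\theta$ is their covariance with parameters $\theta$, and $\sigma^2$ is the error variance.

```latex
\begin{align*}
  y &= F(X) + Zb + \epsilon,
  \qquad b \sim \mathcal{N}(0, \Sigma_\theta),
  \qquad \epsilon \sim \mathcal{N}(0, \sigma^2 I_n), \\
  y &\sim \mathcal{N}\bigl(F(X), \Psi\bigr),
  \qquad \Psi = Z \Sigma_\theta Z^{\top} + \sigma^2 I_n .
\end{align*}
```

In this notation, a classical Gaussian process or linear mixed effects model would correspond to $F(X) = 0$ or $F(X) = X\beta$, while a standard boosting algorithm with independent errors would correspond to $\Psi = \sigma^2 I_n$. The combination described above keeps a non-parametric $F$ together with a non-diagonal $\Psi$.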