Additive Gaussian process (GP) models offer flexible tools for modelling complex non-linear relationships and interaction effects among covariates. While most studies have focused on predictive performance, relatively little attention has been given to identifying the underlying interaction structure, which may be of scientific interest in many applications. In practice, the use of additive GP models in this context has been limited by the cubic computational cost and quadratic storage requirements of GP inference. This paper presents a fast hierarchical additive interaction GP model for multi-dimensional grid data. The model is built on a hierarchical ANOVA decomposition kernel, which incorporates main and interaction effects under the principle of marginality. Kernel centring ensures identifiability and provides a unique, interpretable decomposition into lower- and higher-order effects. For datasets that form a multi-dimensional grid, an efficient implementation is achieved by exploiting the Kronecker product structure of the covariance matrix. Our contribution is to extend Kronecker-based computation to any interaction structure within the proposed class of hierarchical additive GP models, whereas previous methods were limited to separable or fully saturated cases. The benefits of the proposed approach are demonstrated through simulation studies and an application to high-frequency nitrogen dioxide concentration data in London.
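To make the role of kernel centring concrete, the following is a minimal sketch for a two-dimensional grid. It is not taken from the paper: it assumes that centring is done empirically over the grid points (double-centring each one-dimensional Gram matrix), and the squared-exponential kernel, the grid sizes, and the names `se_kernel` and `centre_kernel` are illustrative assumptions.

```python
import numpy as np

def se_kernel(x, lengthscale=0.3):
    """Squared-exponential Gram matrix on 1-D inputs (illustrative choice)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def centre_kernel(K):
    """Empirically centre a Gram matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

# Hypothetical 5 x 4 grid of covariate values.
x1 = np.linspace(0.0, 1.0, 5)
x2 = np.linspace(0.0, 1.0, 4)

K1 = centre_kernel(se_kernel(x1))
K2 = centre_kernel(se_kernel(x2))
J1 = np.ones((5, 5))
J2 = np.ones((4, 4))

# ANOVA-style components on the full grid (row-major ordering of grid points).
K_main1 = np.kron(K1, J2)  # main effect of covariate 1
K_main2 = np.kron(J1, K2)  # main effect of covariate 2
K_int = np.kron(K1, K2)    # two-way interaction

# Marginality: the interaction is only included together with both main effects.
K_full = K_main1 + K_main2 + K_int
```

Under this assumption every row of a centred Gram matrix sums to zero, so the product `K_main1 @ K_main2` vanishes. This is one way to see why the main-effect and interaction components do not overlap and the decomposition is unique.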
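The Kronecker structure mentioned in the abstract can be illustrated with a generic matrix-vector product that never forms the full covariance matrix. This is a sketch of the standard Kronecker identity applied to a sum of terms, not the paper's algorithm; the function names `kron_matvec` and `additive_matvec`, the random positive semi-definite matrices standing in for kernel Gram matrices, and the grid sizes are all assumptions made for illustration.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without building the Kronecker product.

    Uses the row-major identity (A kron B) vec(X) = vec(A X B^T),
    where X is x reshaped to (A.shape[1], B.shape[1]).
    """
    X = x.reshape(A.shape[1], B.shape[1])
    return (A @ X @ B.T).ravel()

def additive_matvec(terms, x):
    """Matrix-vector product for a sum of Kronecker terms [(A, B), ...]."""
    return sum(kron_matvec(A, B, x) for A, B in terms)

rng = np.random.default_rng(0)
n1, n2 = 6, 5

# Random PSD matrices as stand-ins for one-dimensional Gram matrices.
A1 = rng.standard_normal((n1, n1))
A2 = rng.standard_normal((n2, n2))
K1, K2 = A1 @ A1.T, A2 @ A2.T
J1, J2 = np.ones((n1, n1)), np.ones((n2, n2))

# Two main effects plus their interaction, each a single Kronecker product.
terms = [(K1, J2), (J1, K2), (K1, K2)]

x = rng.standard_normal(n1 * n2)
fast = additive_matvec(terms, x)

# Dense reference, feasible only for small grids.
dense = sum(np.kron(A, B) for A, B in terms) @ x
```

Each term costs O(n1 n2 (n1 + n2)) operations instead of O((n1 n2)^2) for the dense product, and only the small factor matrices are stored. A routine like `additive_matvec` could be passed to an iterative solver such as conjugate gradients, which is one possible way to avoid the cubic cost of dense GP inference.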