Gaussian processes (GPs) furnish accurate nonlinear predictions with well-calibrated uncertainty. However, the typical GP setup has a built-in stationarity assumption, making it ill-suited for modeling data from processes with sudden changes, or "jumps" in the output variable. The "jump GP" (JGP) was developed for modeling data from such processes, combining local GPs and latent "level" variables under a joint inferential framework. But joint modeling can be fraught with difficulty. We aim to simplify by suggesting a more modular setup, eschewing joint inference but retaining the main JGP themes: (a) learning optimal neighborhood sizes that locally respect manifolds of discontinuity; and (b) a new cluster-based (latent) feature to capture regions of distinct output levels on both sides of the manifold. We show that each of (a) and (b) separately leads to dramatic improvements when modeling processes with jumps. In tandem (but without requiring joint inference) that benefit is compounded, as illustrated on real and synthetic benchmark examples from the recent literature.
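To make theme (b) concrete, here is a minimal, hypothetical sketch of the modular idea: within a local neighborhood, cluster the outputs into two "levels" and append the cluster label as an extra input feature before fitting an off-the-shelf GP. All names, sizes, and kernel choices below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: a cluster-based latent "level" feature for local GP
# prediction near a jump. Uses scikit-learn; all settings are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic 1-d process with a jump at x = 0.5: two distinct output levels.
X = rng.uniform(0, 1, size=(200, 1))
y = np.where(X[:, 0] < 0.5, 0.0, 5.0) + 0.05 * rng.standard_normal(200)

def jump_aware_predict(Xn, yn, xstar, n_neigh=50):
    """Predict at xstar from the n_neigh nearest neighbors, with a
    two-cluster label on the outputs appended as an input feature."""
    d = np.linalg.norm(Xn - xstar, axis=1)
    idx = np.argsort(d)[:n_neigh]                  # local neighborhood
    Xl, yl = Xn[idx], yn[idx]
    km = KMeans(n_clusters=2, n_init=10, random_state=0)
    z = km.fit_predict(yl.reshape(-1, 1))          # latent "level" per point
    Xaug = np.column_stack([Xl, z.astype(float)])  # cluster label as feature
    gp = GaussianProcessRegressor(RBF(0.1) + WhiteKernel(1e-2),
                                  normalize_y=True).fit(Xaug, yl)
    # Assign xstar the level of its nearest neighbor, then predict.
    zstar = float(z[0])                            # idx is sorted by distance
    xaug = np.column_stack([xstar.reshape(1, -1), [zstar]])
    return gp.predict(xaug)[0]

pred_lo = jump_aware_predict(X, y, np.array([0.25]))  # low-level side
pred_hi = jump_aware_predict(X, y, np.array([0.75]))  # high-level side
```

Because the short RBF length scale makes the two level values effectively dissimilar inputs, the single GP behaves much like separate GPs on either side of the discontinuity, without any joint inference over latent variables.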