Riemannian manifolds provide a principled way to model nonlinear geometric structure inherent in data. A Riemannian metric on such a manifold determines geometry-aware shortest paths and provides the means to define statistical models accordingly. However, these operations are typically computationally demanding. To ease this computational burden, we advocate probabilistic numerical methods for Riemannian statistics. In particular, we focus on Bayesian quadrature (BQ) to numerically compute integrals over normal laws on Riemannian manifolds learned from data. In this task, each function evaluation relies on the solution of an expensive initial value problem. We show that by leveraging both prior knowledge and an active exploration scheme, BQ significantly reduces the number of required evaluations and thus outperforms Monte Carlo methods on a wide range of integration problems. As a concrete application, we highlight the merits of adopting Riemannian geometry with our proposed framework on a nonlinear dataset from molecular dynamics.
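To make the idea of Bayesian quadrature concrete, the following is a minimal sketch of vanilla BQ in one Euclidean dimension. It is not the framework described above: there is no manifold, no initial value problem, and no active exploration scheme. The zero-mean Gaussian process prior, the RBF kernel, the fixed node grid, the test integrand `np.cos`, and the function name `bayesian_quadrature` are all assumptions chosen for illustration. The sketch only shows how a small number of integrand evaluations, combined with a prior over the integrand, can yield an integral estimate together with an uncertainty.

```python
import numpy as np


def bayesian_quadrature(f, nodes, mu=0.0, sigma=1.0, lengthscale=1.0, jitter=1e-10):
    """Vanilla BQ estimate of the integral of f against N(mu, sigma^2) in 1D.

    Illustrative assumptions: zero-mean GP prior on f with an RBF kernel
    k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2)) and fixed nodes.
    Returns the posterior mean and posterior variance of the integral.
    """
    x = np.asarray(nodes, dtype=float)
    y = f(x)  # the only evaluations of the (possibly expensive) integrand
    l2, s2 = lengthscale**2, sigma**2

    # Gram matrix of the kernel at the nodes; jitter aids numerical stability.
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / l2) + jitter * np.eye(x.size)

    # Kernel mean z_i: integral of k(x_i, x') against N(x'; mu, sigma^2),
    # available in closed form for the RBF kernel and a Gaussian measure.
    z = np.sqrt(l2 / (l2 + s2)) * np.exp(-0.5 * (x - mu) ** 2 / (l2 + s2))

    # Prior variance of the integral: double integral of k against the measure.
    prior_var = np.sqrt(l2 / (l2 + 2.0 * s2))

    weights = np.linalg.solve(K, z)  # quadrature weights K^{-1} z
    return weights @ y, prior_var - weights @ z


# Usage: integrate cos(x) against a standard normal; the exact value is exp(-1/2).
nodes = np.linspace(-4.0, 4.0, 11)
bq_mean, bq_var = bayesian_quadrature(np.cos, nodes)

# Plain Monte Carlo baseline with the same number of integrand evaluations.
rng = np.random.default_rng(0)
mc_mean = np.cos(rng.standard_normal(nodes.size)).mean()

print("exact:", np.exp(-0.5))
print("BQ   :", bq_mean, "+/-", np.sqrt(max(bq_var, 0.0)))
print("MC   :", mc_mean)
```

The closed-form kernel mean is what makes this particular kernel and measure pairing convenient. An active variant would pick each new node by optimizing an acquisition function instead of using a fixed grid.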