A Gaussian process (GP) is a powerful and widely used regression technique. The main building block of GP regression is the covariance kernel, which characterizes the relationship between pairs of points in the random field. Optimizing the kernel, however, requires several large-scale and often unstructured matrix inversions. We tackle this challenge by introducing a hierarchical matrix approach, named HMAT, which recursively decomposes the matrix structure into significantly smaller matrices for which a direct approach can be used for inversion. Our matrix partitioning uses a particular aggregation strategy for data points, which promotes the low-rank structure of the off-diagonal blocks in the hierarchical kernel matrix. We employ a randomized linear algebra method for matrix reduction on the low-rank off-diagonal blocks without factorizing a large matrix. We provide analytical error and cost estimates for the inversion of the matrix, investigate them empirically with numerical computations, and demonstrate the application of our approach on three numerical examples involving GP regression for engineering problems and a large-scale real dataset. We provide the computer implementation of GP-HMAT (HMAT adapted for GP likelihood and derivative computations) and the implementation of the last numerical example on a real dataset. We demonstrate superior scalability of the HMAT approach compared to the built-in $\backslash$ operator in MATLAB for large-scale linear solves $\mathbf{A}\mathbf{x} = \mathbf{y}$ via a repeatable and verifiable empirical study. An extension to hierarchical semiseparable (HSS) matrices is discussed as future research.
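To make the recursive idea concrete, the following is a minimal Python/NumPy sketch, not the GP-HMAT implementation. It rests on several assumptions that are not taken from the paper: the function names (`rbf_kernel`, `randomized_factors`, `hsolve`) are hypothetical, sorting 1-D points stands in for the aggregation strategy, the dense kernel matrix is formed explicitly for simplicity, and the off-diagonal coupling is removed with the Woodbury identity, which is one possible choice. `np.linalg.solve` plays the role of the direct solver at the leaves and of the dense reference solve.

```python
import numpy as np

def rbf_kernel(x, y, length_scale):
    """Squared-exponential kernel between two sets of 1-D points."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def randomized_factors(B, rank, oversample=10, seed=0):
    """Balanced factors L, R with B ~= L @ R.T from a randomized SVD."""
    rng = np.random.default_rng(seed)
    k = min(rank + oversample, *B.shape)
    # Randomized range finder: orthonormal basis for the sampled range of B.
    Q, _ = np.linalg.qr(B @ rng.standard_normal((B.shape[1], k)))
    # SVD of the small projected matrix, truncated to the requested rank.
    U, s, Vt = np.linalg.svd(Q.T @ B, full_matrices=False)
    root = np.sqrt(s[:rank])
    return (Q @ U[:, :rank]) * root, Vt[:rank].T * root

def hsolve(A, Y, leaf_size=64, rank=20):
    """Approximately solve A X = Y for symmetric A by 2x2 block recursion."""
    n = A.shape[0]
    if n <= leaf_size:
        return np.linalg.solve(A, Y)  # small block: direct solve
    m = n // 2
    L, R = randomized_factors(A[:m, m:], rank)  # A12 ~= L R^T, A21 ~= R L^T
    r = L.shape[1]
    # One recursive solve per diagonal block covers both the right-hand
    # side and the low-rank factor.
    S1 = hsolve(A[:m, :m], np.hstack([Y[:m], L]), leaf_size, rank)
    S2 = hsolve(A[m:, m:], np.hstack([Y[m:], R]), leaf_size, rank)
    Z1, G1 = S1[:, :-r], S1[:, -r:]  # A11^{-1} Y1 and A11^{-1} L
    Z2, G2 = S2[:, :-r], S2[:, -r:]  # A22^{-1} Y2 and A22^{-1} R
    # Woodbury correction for A = blkdiag(A11, A22) + W S W^T, where
    # W = blkdiag(L, R) and S = [[0, I], [I, 0]] (so S^{-1} = S).
    I = np.eye(r)
    M = np.block([[L.T @ G1, I], [I, R.T @ G2]])
    c = np.linalg.solve(M, np.vstack([L.T @ Z1, R.T @ Z2]))
    return np.vstack([Z1 - G1 @ c[:r], Z2 - G2 @ c[r:]])

# Illustrative problem: sorted 1-D inputs keep nearby points in the same block.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 512))
A = rbf_kernel(x, x, 0.2) + 0.1 * np.eye(512)  # kernel matrix plus noise term
y = np.sin(6.0 * x)[:, None]

x_h = hsolve(A, y)
x_d = np.linalg.solve(A, y)  # dense reference solve
rel_err = np.linalg.norm(x_h - x_d) / np.linalg.norm(x_d)
print("relative difference from dense solve:", rel_err)
```

The sketch only illustrates why low-rank off-diagonal blocks let a large solve be reduced to small direct solves plus a small correction system. It makes no attempt to reproduce the cost or error estimates discussed in the paper, and the leaf size, rank, and kernel parameters are arbitrary illustrative values.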