A general, {\em rectangular} kernel matrix may be defined as $K_{ij} = \kappa(x_i,y_j)$, where $\kappa(x,y)$ is a kernel function and $X=\{x_i\}_{i=1}^m$ and $Y=\{y_j\}_{j=1}^n$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large and are not well-separated (e.g., the points in $X$ and $Y$ may be ``intermingled''). Such rectangular kernel matrices may arise, for example, in Gaussian process regression, where $X$ corresponds to the training data and $Y$ corresponds to the test data. In this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function and avoid forming the matrix explicitly, which rules out most purely algebraic techniques. In particular, we seek methods that scale linearly, i.e., with computational complexity $O(m)$ or $O(n)$ for a fixed accuracy or rank. The main idea in this paper is to {\em geometrically} select appropriate subsets of points to construct a low-rank approximation. An analysis in this paper guides how this selection should be performed.