We study fast algorithms for computing fundamental properties of a positive semidefinite kernel matrix $K \in \mathbb{R}^{n \times n}$ corresponding to $n$ points $x_1,\ldots,x_n \in \mathbb{R}^d$. In particular, we consider estimating the sum of the kernel matrix entries, along with the matrix's top eigenvalue and eigenvector. We show that the sum of matrix entries can be estimated to $1+\epsilon$ relative error in time sublinear in $n$ and linear in $d$ for many popular kernels, including the Gaussian, exponential, and rational quadratic kernels. For these kernels, we also show that the top eigenvalue (and an approximate eigenvector) can be approximated to $1+\epsilon$ relative error in time subquadratic in $n$ and linear in $d$. Our algorithms represent significant advances in the best known runtimes for these problems. They leverage the positive definiteness of the kernel matrix, along with a recent line of work on efficient kernel density estimation.
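To make the two target quantities concrete, here is a minimal exact baseline in Python. It is not the algorithm of this work: it builds the full kernel matrix in $O(n^2 d)$ time, which is precisely the cost the sublinear and subquadratic results aim to avoid. The function names, the Gaussian parametrization $k(x,y) = \exp(-\|x-y\|^2 / (2\sigma^2))$ with a `bandwidth` parameter, and the use of plain power iteration are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Assumed parametrization: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_matrix(points, kernel=gaussian_kernel):
    """Dense n x n kernel matrix; this step alone costs O(n^2 d) time."""
    return [[kernel(x, y) for y in points] for x in points]


def kernel_sum(K):
    """Exact sum of all entries of K."""
    return sum(sum(row) for row in K)


def matvec(K, v):
    """Dense matrix-vector product K v."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in K]


def top_eigenpair(K, iters=500, seed=0):
    """Plain power iteration for the top eigenpair of a PSD matrix K."""
    rng = random.Random(seed)
    # Strictly positive start vector, then normalize to unit length.
    v = [rng.random() + 0.1 for _ in range(len(K))]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        w = matvec(K, v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient v^T K v of the final unit vector.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(K, v)))
    return eigenvalue, v
```

A useful sanity check when experimenting with such a baseline: because the sum of entries equals $\mathbf{1}^\top K \mathbf{1}$, the Rayleigh quotient of the normalized all-ones vector shows that the top eigenvalue is at least the sum of entries divided by $n$.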