The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely wide neural networks trained under least-squares loss by gradient descent (Jacot et al., 2018). However, despite its importance, the super-quadratic runtime of kernel methods limits the use of the NTK in large-scale learning tasks. To accelerate kernel machines with the NTK, we propose a near input-sparsity time algorithm that maps the input data to a randomized low-dimensional feature space such that the inner products of the transformed data approximate their NTK evaluations. Furthermore, we propose a feature map for approximating the convolutional counterpart of the NTK (Arora et al., 2019), which can transform any image in time that is only linear in the number of pixels. We show that on standard large-scale regression and classification tasks, a linear regressor trained on our features outperforms trained neural networks and the Nyström method with the NTK.
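To make the feature-map idea concrete, the sketch below shows a naive gradient random-features baseline for the NTK of a two-layer ReLU network: sample finite-width random weights and use the network's per-parameter gradients as features, so that the Gram matrix of the features equals the empirical NTK and converges to the infinite-width NTK as the width `m` grows. This is only an illustrative baseline under these assumptions, not the paper's algorithm; the paper's contribution is precisely to avoid the large `m * d`-dimensional gradient block via randomized sketching in near input-sparsity time.

```python
import numpy as np

def ntk_random_features(X, m=2048, seed=0):
    """Map rows of X to features whose inner products approximate the NTK
    of a two-layer ReLU network with m random hidden units.
    (Naive gradient-features baseline; not the paper's sketching method.)"""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, m))           # random first-layer weights
    Z = X @ W                                 # pre-activations, shape (n, m)
    act = np.maximum(Z, 0.0) / np.sqrt(m)     # ReLU outputs: gradient w.r.t. the output layer
    gate = (Z > 0).astype(X.dtype)            # ReLU derivative (step function)
    # Gradient w.r.t. each hidden weight vector factorizes as step(w_i . x) * x,
    # giving an (n, m*d) block whose Gram matrix is (1/m) * step-kernel * <x, y>.
    grad = (gate[:, :, None] * X[:, None, :]).reshape(n, m * d) / np.sqrt(m)
    return np.hstack([act, grad])

# Inner products of the features approximate the two-layer ReLU NTK:
X = np.random.default_rng(1).standard_normal((5, 10))
Phi = ntk_random_features(X)
K_approx = Phi @ Phi.T  # -> empirical NTK Gram matrix; converges as m grows
```

Note that the gradient block has dimension `m * d`, which grows with both width and input dimension; a low-dimensional sketch of such features is what makes training a linear regressor on them practical at scale.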