The Neural Tangent Kernel (NTK) characterizes the behavior of infinitely wide neural networks trained by gradient descent under least-squares loss. However, despite its importance, the super-quadratic runtime of kernel methods limits the use of the NTK in large-scale learning tasks. To accelerate NTK-based kernel machines, we propose a near-input-sparsity-time algorithm that maps the input data to a randomized low-dimensional feature space so that the inner products of the transformed data approximate their NTK evaluations. Our transformation works by sketching the polynomial expansions of arc-cosine kernels. Furthermore, we propose a feature map for approximating the convolutional counterpart of the NTK, which can transform any image in time linear in the number of pixels. We show that on standard large-scale regression and classification tasks, a linear regressor trained on our features outperforms trained neural networks and the Nyström approximation of the NTK.
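To make the idea concrete, below is a minimal NumPy sketch of the plain random-features baseline for the two-layer ReLU NTK, using the standard decomposition into arc-cosine kernels of degrees 0 and 1 (Cho and Saul). Here the inner products of ReLU features approximate the degree-1 kernel, and step features tensored with the input approximate the degree-0 term scaled by the input inner product. This is only an illustrative baseline under one common parametrization: the feature dimension `m`, the helper names, and the exact constants are assumptions, and the naive tensoring below incurs the very dimension blow-up that the paper's sketching of polynomial expansions is designed to avoid.

```python
# A minimal sketch of random features for the two-layer ReLU NTK,
# assuming Theta(x, y) = <x, y> * k0(u) + k1(x, y), where k0 and k1 are
# the arc-cosine kernels of degrees 0 and 1 and u is the cosine of the
# angle between x and y. Constants depend on the parametrization; this
# is a hypothetical baseline, not the paper's sketching algorithm.
import numpy as np

def ntk_random_features(X, m, rng):
    """Map rows of X (n x d) to features whose inner products
    approximate the two-layer ReLU NTK in expectation."""
    n, d = X.shape
    W = rng.standard_normal((d, m))      # shared Gaussian directions w_i
    Z = X @ W                            # (n x m) projections w_i . x
    relu = np.maximum(Z, 0.0)            # degree-1 arc-cosine features
    step = (Z > 0.0).astype(X.dtype)     # degree-0 arc-cosine features
    # k1 part: (2/m) * sum_i ReLU(w_i.x) ReLU(w_i.y) approximates k1(x, y).
    phi1 = np.sqrt(2.0 / m) * relu
    # <x, y> * k0 part: tensor the step features with x itself, so that
    # <phi0(x), phi0(y)> = <x, y> * (2/m) sum_i step(w_i.x) step(w_i.y).
    # This (m * d)-dimensional map is the blow-up that sketching removes.
    phi0 = np.sqrt(2.0 / m) * (step[:, :, None] * X[:, None, :]).reshape(n, m * d)
    return np.hstack([phi1, phi0])       # (n x (m + m*d)) feature matrix

def exact_two_layer_ntk(x, y):
    """Closed-form two-layer ReLU NTK via arc-cosine kernels."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    u = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    k0 = (np.pi - np.arccos(u)) / np.pi
    k1 = nx * ny * (np.sqrt(1.0 - u**2) + (np.pi - np.arccos(u)) * u) / np.pi
    return (x @ y) * k0 + k1

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
F = ntk_random_features(X, m=4096, rng=rng)
print(F @ F.T)                           # approximate NTK Gram matrix
print(exact_two_layer_ntk(X[0], X[1]))   # compare against one exact entry
```

After this map, the kernel regression problem reduces to linear regression on the feature matrix `F`, which is what makes the approach attractive at scale once the feature dimension is brought down by sketching.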