The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key DNN architecture remains to be kernelized, namely, the recurrent neural network (RNN). In this paper we introduce and study the Recurrent Neural Tangent Kernel (RNTK), which provides new insights into the behavior of overparametrized RNNs. A key property of the RNTK that should greatly benefit practitioners is its ability to compare inputs of different length. To this end, we characterize how the RNTK weights different time steps to form its output under different initialization parameters and nonlinearity choices. Experiments on a synthetic data set and 56 real-world data sets demonstrate that the RNTK offers significant performance gains over other kernels, including standard NTKs, across a wide array of data sets.
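To make the variable-length claim concrete, the sketch below shows the infinite-width GP covariance recursion for a single-layer ReLU RNN with zero initial hidden state, which is the kind of recursion that kernels such as the RNTK build on. It is illustrative only: the actual RNTK additionally accumulates gradient (tangent) terms, and the hyperparameter names (`sw`, `su`, `sb`), function names, and the end-alignment convention for unequal lengths are assumptions of this sketch, not the paper's definition.

```python
import numpy as np

def relu_cov(kxx, kxy, kyy):
    # E[ReLU(u) ReLU(v)] for (u, v) ~ N(0, [[kxx, kxy], [kxy, kyy]]);
    # the degree-1 arc-cosine kernel in closed form.
    denom = np.sqrt(kxx * kyy)
    if denom == 0.0:
        return 0.0
    cos_t = float(np.clip(kxy / denom, -1.0, 1.0))
    theta = np.arccos(cos_t)
    return denom * (np.sin(theta) + (np.pi - theta) * cos_t) / (2.0 * np.pi)

def rnn_gp_kernel(x, y, sw=1.0, su=1.0, sb=0.1):
    # Covariance of the final pre-activation state of an infinitely wide
    # single-layer ReLU RNN run on sequences x (Tx, d) and y (Ty, d).
    # Sequences of different length are aligned at their final time step
    # (the shorter one is effectively zero-padded at the front) -- one of
    # several possible conventions, chosen here purely for illustration.
    Tx, Ty = len(x), len(y)
    d = x.shape[1]
    T = max(Tx, Ty)
    xs = [None] * (T - Tx) + list(x)
    ys = [None] * (T - Ty) + list(y)
    kxx = kxy = kyy = 0.0  # zero initial hidden state
    for xt, yt in zip(xs, ys):
        vxx, vyy = relu_cov(kxx, kxx, kxx), relu_cov(kyy, kyy, kyy)
        vxy = relu_cov(kxx, kxy, kyy)
        nxx = sw**2 * vxx + su**2 * (xt @ xt) / d + sb**2 if xt is not None else 0.0
        nyy = sw**2 * vyy + su**2 * (yt @ yt) / d + sb**2 if yt is not None else 0.0
        nxy = (sw**2 * vxy + su**2 * (xt @ yt) / d + sb**2
               if xt is not None and yt is not None else 0.0)
        kxx, kyy, kxy = nxx, nyy, nxy
    return kxy

# Usage: compare a length-7 and a length-4 sequence of 3-dimensional inputs.
x = np.random.randn(7, 3)
y = np.random.randn(4, 3)
print(rnn_gp_kernel(x, y))
```

Because the recursion only tracks three scalars per pair of sequences, the same update rule applies regardless of sequence length, which is what allows a kernel of this form to compare inputs of different length.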