The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions. Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network. In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation. This inference is analytic and free of local optima issues found in standard techniques such as fine-tuning neural network weights to a new task. We develop significant computational speed-ups based on matrix multiplies, including a novel implementation for scalable Fisher vector products. Our experiments on both image classification and regression demonstrate the promise and convenience of this framework for transfer learning, compared to neural network fine-tuning. Code is available at https://github.com/amzn/xfer/tree/master/finite_ntk.
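To make the core idea concrete, below is a minimal sketch (not the authors' finite_ntk implementation, which builds on PyTorch/GPyTorch) of adapting a linearized network via analytic Gaussian process inference. The kernel is the inner product of network Jacobians with respect to the parameters, and the network's own prediction serves as the prior mean. The small MLP, the sine-regression data, the layer sizes, and the noise level are all illustrative assumptions.

```python
# Sketch: GP regression with a Jacobian (finite-NTK) kernel around a "pre-trained" MLP.
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def init_mlp(key, sizes=(1, 32, 32, 1)):
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (din, dout)) / jnp.sqrt(din),
                       jnp.zeros(dout)))
    return params

def mlp(params, x):
    h = x
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b).squeeze(-1)

key = jax.random.PRNGKey(0)
params = init_mlp(key)                      # stands in for a trained network
flat, unravel = ravel_pytree(params)

def f_flat(theta, x):                       # network as a function of flat parameters
    return mlp(unravel(theta), x)

def jacobian(x):                            # J(x): shape (n, num_params)
    return jax.jacrev(f_flat)(flat, x)

# Hypothetical new-task data: a few noisy observations of sin(x).
x_train = jnp.linspace(-3.0, 3.0, 20)[:, None]
y_train = jnp.sin(x_train[:, 0]) + 0.1 * jax.random.normal(key, (20,))
x_test = jnp.linspace(-4.0, 4.0, 100)[:, None]

J_train, J_test = jacobian(x_train), jacobian(x_test)
noise = 0.1 ** 2
K = J_train @ J_train.T + noise * jnp.eye(len(x_train))   # Jacobian kernel + noise
K_star = J_test @ J_train.T

# Analytic GP posterior around the linearized network: a linear solve, no fine-tuning loop.
alpha = jnp.linalg.solve(K, y_train - f_flat(flat, x_train))
mean = f_flat(flat, x_test) + K_star @ alpha
cov = J_test @ J_test.T - K_star @ jnp.linalg.solve(K, K_star.T)
std = jnp.sqrt(jnp.clip(jnp.diag(cov), 0.0))
```

Here `mean` and `std` give predictions on the new task together with uncertainty estimates; adaptation reduces to a single linear solve in the Jacobian feature space rather than an iterative, local-optima-prone fine-tuning procedure. For networks of realistic size, forming the Jacobian explicitly is infeasible, which is where the matrix-multiply-based speed-ups and Fisher vector products mentioned in the abstract come in.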