We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new, deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state-of-the-art accuracy on the ImageNet dataset.
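To make the notion of a function-preserving transformation concrete, the following is a minimal NumPy sketch of the two operations the abstract alludes to, restricted for illustration to fully connected layers with ReLU activations: widening a hidden layer by replicating units and splitting their outgoing weights, and deepening by inserting an identity-initialized layer. The function names, shapes, and the ReLU assumption are illustrative choices, not a prescribed implementation.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng=None):
    """Widen a hidden layer from W1.shape[1] units to new_width units.

    W1: (d_in, d_h) incoming weights, b1: (d_h,) biases,
    W2: (d_h, d_out) outgoing weights. Returns (W1', b1', W2')
    computing the same function as the original two layers.
    """
    rng = np.random.default_rng() if rng is None else rng
    d_h = W1.shape[1]
    assert new_width >= d_h
    # Random mapping g: each new unit copies an existing one;
    # the first d_h entries map to themselves.
    g = np.concatenate([np.arange(d_h),
                        rng.integers(0, d_h, new_width - d_h)])
    counts = np.bincount(g, minlength=d_h)  # replication factor per unit
    W1_new = W1[:, g]                       # copy incoming weights
    b1_new = b1[g]
    # Divide each outgoing weight by its unit's replication count,
    # so the replicated units' contributions sum to the original.
    W2_new = W2[g, :] / counts[g][:, None]
    return W1_new, b1_new, W2_new

def net2deeper(d_h):
    """New fully connected layer initialized to the identity map.

    Function-preserving for ReLU, since relu(relu(h)) == relu(h).
    """
    return np.eye(d_h), np.zeros(d_h)

# Quick check that both transformations preserve the network's output.
rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0)
x = rng.standard_normal((5, 4))
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(3)
W2, b2 = rng.standard_normal((3, 2)), rng.standard_normal(2)
y = relu(x @ W1 + b1) @ W2 + b2

W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=6, rng=rng)
assert np.allclose(y, relu(x @ W1w + b1w) @ W2w + b2)

Wd, bd = net2deeper(6)
assert np.allclose(y, relu(relu(x @ W1w + b1w) @ Wd + bd) @ W2w + b2)
```

Because the widened and deepened networks compute exactly the same function as the source network at initialization, training can resume from the parent's knowledge rather than from scratch.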