Transfer learning is a popular technique for improving the performance of neural networks. However, existing methods are limited to transferring parameters between networks with the same architecture. We present a method for transferring parameters between neural networks with different architectures. Our method, called DPIAT, uses dynamic programming to match blocks and layers between architectures and transfer parameters efficiently. Compared with existing parameter-prediction and random-initialization methods, it significantly improves training efficiency and validation accuracy. In experiments on ImageNet, our method improved validation accuracy by a factor of 1.6 on average after 50 epochs of training. DPIAT allows both researchers and neural architecture search systems to modify trained networks and reuse knowledge, avoiding the need for retraining from scratch. We also introduce a network architecture similarity measure, enabling users to choose the best source network without any training.
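To give a concrete sense of the dynamic-programming matching the abstract describes, the sketch below aligns two sequences of layers so that compatible layers are paired. It is a minimal illustration under our own assumptions, not the paper's exact algorithm: the (type, width) layer descriptors and the similarity scores are hypothetical.

def layer_similarity(src, dst):
    """Score how well a source layer could seed a target layer (hypothetical:
    1.0 for identical (type, width) descriptors, 0.5 for matching type only)."""
    if src == dst:
        return 1.0
    if src[0] == dst[0]:
        return 0.5
    return 0.0

def match_layers(source, target):
    """Classic DP sequence alignment with zero gap cost: returns the best
    total similarity and the list of matched (source_idx, target_idx) pairs."""
    n, m = len(source), len(target)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j],      # leave source layer i-1 unmatched
                score[i][j - 1],      # leave target layer j-1 unmatched
                score[i - 1][j - 1] + layer_similarity(source[i - 1], target[j - 1]),
            )
    # Backtrack through the table to recover which layers were paired.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        s = layer_similarity(source[i - 1], target[j - 1])
        if s > 0 and score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return score[n][m], pairs[::-1]

# Layers as (type, width) tuples -- hypothetical descriptors for illustration.
src = [("conv", 64), ("conv", 128), ("fc", 1000)]
dst = [("conv", 64), ("conv", 96), ("conv", 128), ("fc", 1000)]
print(match_layers(src, dst))  # (3.0, [(0, 0), (1, 2), (2, 3)])

A gap cost of zero lets either network skip layers freely, so an alignment of this kind degrades gracefully when the two architectures differ in depth.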