This work presents a naive algorithm for parameter transfer between different architectures, using a computationally cheap injection technique that requires no data. The primary objective is to speed up the training of neural networks from scratch. We found that transferring knowledge from any pre-trained architecture yielded better initialization than the Kaiming and Xavier schemes. The presented method converges faster, which makes it a drop-in replacement for classical initialization methods. The method involves two steps: 1) matching: the layers of the pre-trained model are paired with those of the target model; 2) injection: each source tensor is transformed into the shape required by its target layer, as sketched below. This work also compares the similarity of current SOTA ImageNet architectures using the TLI (Transfer Learning by Injection) score.
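A minimal sketch of the matching and injection steps described above, assuming PyTorch models. The function names (match_layers, inject, transfer) and the crop/tile reshaping strategy are illustrative assumptions, not the authors' actual implementation:

```python
import torch
import torch.nn as nn


def match_layers(source: nn.Module, target: nn.Module):
    """Pair learnable layers of the same kind in network order (assumed matching rule)."""
    src = [m for m in source.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    tgt = [m for m in target.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    # Unmatched target layers simply keep their default random initialization.
    return [(s, t) for s, t in zip(src, tgt) if type(s) is type(t)]


def inject(src_w: torch.Tensor, tgt_w: torch.Tensor) -> torch.Tensor:
    """Transform a source tensor into the target shape by cropping or tiling
    each dimension -- one cheap, data-free way to realize 'injection'."""
    out = src_w
    for dim, size in enumerate(tgt_w.shape):
        if out.shape[dim] >= size:
            # Source dimension is large enough: crop the leading slice.
            out = out.narrow(dim, 0, size)
        else:
            # Source dimension is too small: tile copies, then crop to size.
            reps = -(-size // out.shape[dim])  # ceil division
            out = torch.cat([out] * reps, dim=dim).narrow(dim, 0, size)
    return out.contiguous()


@torch.no_grad()
def transfer(source: nn.Module, target: nn.Module) -> nn.Module:
    """Initialize `target` from `source` instead of Kaiming/Xavier."""
    for s, t in match_layers(source, target):
        t.weight.copy_(inject(s.weight, t.weight))
        if s.bias is not None and t.bias is not None:
            t.bias.copy_(inject(s.bias, t.bias))
    return target
```

Under these assumptions, one would call transfer(pretrained_model, new_model) once before the usual training loop; any layer without a counterpart in the source network retains its classical initialization.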