Knowledge embedded in the weights of an artificial neural network can be used to improve the network structure, as in network compression. However, such knowledge is typically extracted by hand-crafted rules, which may be inaccurate and may overlook relevant information. Inspired by how learning works in the mammalian brain, in this paper we mine the knowledge contained in the weights of a neural network for automatic architecture learning. We introduce a switcher neural network (SNN) that takes as input the weights of a task-specific neural network (TNN). By mining the knowledge contained in these weights, the SNN outputs scaling factors that turn off and weight neurons in the TNN. To optimize the structure and the parameters of the TNN simultaneously, the SNN and the TNN are trained alternately, under the same performance evaluation of the TNN, using stochastic gradient descent. We test our method on widely used datasets and popular networks for classification. In terms of accuracy, our method outperforms baseline networks and other structure learning methods stably and significantly. At the same time, it compresses the baseline networks without introducing any sparsity-inducing mechanism, and it yields a lower compression rate on simpler baselines or more difficult tasks. These results demonstrate that our method produces a more reasonable structure.
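To make the alternating scheme concrete, the following is a minimal PyTorch sketch under several simplifying assumptions: a single gated hidden layer, sigmoid gates, and one gradient step per phase. The class names (TaskNet, SwitcherNet) and the choice to summarize each TNN neuron by its incoming weight vector are illustrative, not the paper's exact design.

```python
# Minimal sketch of the alternating SNN/TNN scheme described in the abstract.
# All names and architectural details below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskNet(nn.Module):
    """Task-specific network (TNN) with one gated hidden layer."""
    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x, gates):
        h = F.relu(self.fc1(x))
        h = h * gates  # scale (and possibly turn off) hidden neurons
        return self.fc2(h)

class SwitcherNet(nn.Module):
    """Switcher network (SNN): maps TNN weights to per-neuron scaling factors."""
    def __init__(self, in_dim=784):
        super().__init__()
        # Assumption: each hidden neuron is summarized by its incoming
        # weight vector plus its bias.
        self.score = nn.Sequential(
            nn.Linear(in_dim + 1, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, tnn):
        w = torch.cat([tnn.fc1.weight, tnn.fc1.bias.unsqueeze(1)], dim=1)
        return torch.sigmoid(self.score(w)).squeeze(1)  # one gate per neuron

tnn, snn = TaskNet(), SwitcherNet()
opt_tnn = torch.optim.SGD(tnn.parameters(), lr=0.1)
opt_snn = torch.optim.SGD(snn.parameters(), lr=0.01)

def alternating_step(x, y):
    # Phase 1: update TNN parameters with the SNN held fixed.
    gates = snn(tnn).detach()
    loss = F.cross_entropy(tnn(x, gates), y)
    opt_tnn.zero_grad(); loss.backward(); opt_tnn.step()
    # Phase 2: update SNN parameters under the same task loss,
    # stepping only the SNN optimizer so the TNN stays fixed.
    gates = snn(tnn)
    loss = F.cross_entropy(tnn(x, gates), y)
    opt_snn.zero_grad(); loss.backward(); opt_snn.step()

# Illustrative usage on random data:
# x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
# alternating_step(x, y)
```

In this reading, a full training loop would iterate alternating_step over mini-batches; compression would then follow by removing neurons whose gates saturate near zero, which is one plausible way the scaling factors "turn off" TNN neurons without an explicit sparsity penalty.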