Sparsity in the structure of Neural Networks can lead to lower energy consumption, lower memory usage, faster computation on suitable hardware, and automated machine learning. If sparsity gives rise to certain kinds of structure, it can help explain the features that are obtained automatically during learning. We provide insights from experiments in which we show how sparsity can be achieved through prior initialization, through pruning, and during learning, and we answer questions about the relationship between the structure of Neural Networks and their performance. This includes the first work on introducing priors derived from network theory into Recurrent Neural Networks and an architectural performance prediction during a Neural Architecture Search. Within our experiments, we show how magnitude class-blinded pruning achieves 97.5% accuracy on MNIST with 80% compression and re-training, which is 0.5 percentage points more than without compression, that magnitude class-uniform pruning is significantly inferior to it, and how a genetic search enhanced with performance prediction achieves 82.4% accuracy on CIFAR10. Further, performance prediction for Recurrent Networks learning the Reber grammar reaches an $R^2$ of up to 0.81 given only structural information.
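To make the distinction between the two compared pruning variants concrete, the following is a minimal sketch, assuming PyTorch (not prescribed by this work); the function names and the small MLP are illustrative only. Class-blinded pruning applies one global magnitude threshold across all weight matrices, whereas class-uniform pruning removes the same fraction of weights within each layer using a per-layer threshold; re-training after pruning, as in the reported MNIST result, would follow with a standard training loop.

```python
import torch
import torch.nn as nn

def prune_class_blinded(model: nn.Module, sparsity: float) -> None:
    """Zero out the globally smallest-magnitude weights across all layers,
    using a single threshold shared by every weight matrix ("blinded" to layers)."""
    all_weights = torch.cat([p.detach().abs().flatten()
                             for p in model.parameters() if p.dim() > 1])
    k = int(sparsity * all_weights.numel())
    if k == 0:
        return
    threshold = torch.kthvalue(all_weights, k).values
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:
                p.mul_((p.abs() > threshold).float())

def prune_class_uniform(model: nn.Module, sparsity: float) -> None:
    """Prune the same fraction of weights in every layer, each layer
    computing its own magnitude threshold."""
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:
                k = int(sparsity * p.numel())
                if k == 0:
                    continue
                threshold = torch.kthvalue(p.detach().abs().flatten(), k).values
                p.mul_((p.abs() > threshold).float())

# Illustrative usage: prune a small MLP (hypothetical architecture) to 80%
# sparsity, corresponding to the 80% compression setting; re-training follows.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
prune_class_blinded(model, sparsity=0.8)
```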