Designing neural network architectures is a challenging task, and knowing which specific layers of a model must be adapted to improve performance remains largely an open question. In this paper, we introduce a novel methodology to identify layers that decrease the test accuracy of trained models. These conflicting layers can be detected as early as the beginning of training. In the worst case, we prove that such a layer can lead to a network that cannot be trained at all. We provide a theoretical analysis of the origin of these layers, which lower overall network performance, and complement it with an extensive empirical evaluation. More precisely, we identify layers that worsen performance because they produce what we call conflicting training bundles. We show that around 60% of the layers of trained residual networks can be completely removed from the architecture with no significant increase in test error. We further present a novel neural-architecture-search (NAS) algorithm that identifies conflicting layers at the beginning of training. Architectures found by our auto-tuning algorithm achieve competitive accuracy when compared against more complex state-of-the-art architectures, while drastically reducing memory consumption and inference time for different computer vision tasks. The source code is available at https://github.com/peerdavid/conflicting-bundles
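To make the central notion concrete: a training bundle groups samples whose activations at a given layer have collapsed to (numerically) the same value, and a bundle is conflicting when it mixes samples with different labels, since no later layer can separate them. The following is a minimal illustrative sketch of this idea, not the paper's actual detection algorithm; the function name, the pairwise-comparison strategy, and the `tol` threshold are all assumptions made for illustration.

```python
import numpy as np

def find_conflicting_bundles(activations, labels, tol=1e-6):
    """Illustrative sketch: group samples whose layer activations
    coincide (within tol) into bundles, then flag bundles that
    contain more than one distinct label as conflicting."""
    n = len(labels)
    bundle_of = np.full(n, -1)  # bundle index per sample, -1 = unassigned
    bundles = []
    for i in range(n):
        if bundle_of[i] >= 0:
            continue  # already placed in an earlier bundle
        members = [i]
        bundle_of[i] = len(bundles)
        for j in range(i + 1, n):
            if bundle_of[j] < 0 and np.allclose(
                activations[i], activations[j], atol=tol
            ):
                members.append(j)
                bundle_of[j] = len(bundles)
        bundles.append(members)
    # A bundle conflicts if its members carry more than one label.
    conflicting = [b for b in bundles if len({labels[k] for k in b}) > 1]
    return bundles, conflicting

# Toy example: samples 0 and 1 share an activation but differ in label,
# so they form one conflicting bundle; sample 2 is alone and harmless.
acts = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
labels = [0, 1, 0]
bundles, conflicting = find_conflicting_bundles(acts, labels)
```

In this toy example the layer has merged two differently labeled samples into one bundle, which is exactly the situation the paper's analysis identifies as harmful to trainability.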