Multilayer neural networks have achieved superhuman performance in many artificial intelligence applications. However, their black-box nature obscures how input data are transformed into labels across the layers, which hinders both architecture design for new tasks and interpretation in high-stakes decision-making. We address this problem by introducing a precise law that governs how real-world deep neural networks separate data according to class membership, from the bottom layers to the top layers, in classification problems. The law states that each layer improves a certain measure of data separation by a roughly \textit{equal} multiplicative factor. It manifests in modern architectures such as AlexNet, VGGNet, and ResNet in the late phase of training. Together with the perspective of data separation, the law offers practical guidelines for designing network architectures, improving model robustness and out-of-sample performance during training, and interpreting deep learning predictions.
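One way to read the "equal multiplicative factor" claim is as geometric decay: if $D_l$ denotes the separation measure after layer $l$, then $D_l \approx \rho^{l} D_0$ for some ratio $0 < \rho < 1$, so $\log D_l$ is roughly linear in $l$. The abstract does not define the measure, so the sketch below assumes a ratio of within-class to between-class scatter, $\mathrm{Tr}(SS_w\, SS_b^{+})$; the symbols $D_l$ and $\rho$ and the function names `separation_fuzziness` and `fit_decay_ratio` are illustrative assumptions, not taken from the text above.

```python
import numpy as np


def separation_fuzziness(features, labels):
    """Assumed separation measure: Tr(SS_w @ pinv(SS_b)).

    features: (n_samples, n_dims) array of one layer's outputs.
    labels:   (n_samples,) array of class labels.
    Smaller values mean the classes are better separated.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)

    ss_w = np.zeros((d, d))  # within-class scatter
    ss_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        x_c = features[labels == c]
        mean_c = x_c.mean(axis=0)
        centered = x_c - mean_c
        ss_w += centered.T @ centered / n
        diff = (mean_c - global_mean)[:, None]
        ss_b += (len(x_c) / n) * (diff @ diff.T)

    # The pseudo-inverse handles a rank-deficient between-class matrix.
    return float(np.trace(ss_w @ np.linalg.pinv(ss_b)))


def fit_decay_ratio(fuzziness_per_layer):
    """Least-squares fit of log D_l = log D_0 + l * log(rho).

    Returns (rho, D_0) as estimated from the per-layer values.
    """
    d = np.asarray(fuzziness_per_layer, dtype=float)
    layers = np.arange(len(d))
    slope, intercept = np.polyfit(layers, np.log(d), 1)
    return float(np.exp(slope)), float(np.exp(intercept))
```

In use, one would compute `separation_fuzziness` on the activations of each layer for the same batch of labeled inputs, then pass the resulting list to `fit_decay_ratio`; a close linear fit in log scale would be consistent with the law as described, under the assumed measure.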