The question of what makes a data distribution suitable for deep learning is a fundamental open problem. Focusing on locally connected neural networks (a prevalent family of architectures that includes convolutional and recurrent neural networks as well as local self-attention models), we address this problem by adopting theoretical tools from quantum physics. Our main theoretical result states that a certain locally connected neural network is capable of accurate prediction over a data distribution if and only if the data distribution admits low quantum entanglement under certain canonical partitions of features. As a practical application of this result, we derive a preprocessing method for enhancing the suitability of a data distribution to locally connected neural networks. Experiments with widespread models over various datasets demonstrate our findings. We hope that our use of quantum entanglement will encourage further adoption of tools from physics for formally reasoning about the relation between deep learning and real-world data.
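To make the notion of entanglement under a partition of features concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not spell out the entanglement measure, so the sketch assumes a standard one: the entropy of the normalized squared singular values of the tensor's matricization with respect to the partition. The function name `entanglement_entropy` and the toy tensor `paired` are hypothetical and serve only as illustration.

```python
import numpy as np


def entanglement_entropy(tensor, subset):
    """Entanglement entropy of `tensor` under the partition (subset, complement).

    Assumed definition: arrange the tensor as a matrix whose rows are indexed
    by the axes in `subset` and whose columns are indexed by the remaining
    axes, then take the Shannon entropy of the normalized squared singular
    values of that matrix.
    """
    subset = list(subset)
    rest = [ax for ax in range(tensor.ndim) if ax not in subset]
    rows = int(np.prod([tensor.shape[ax] for ax in subset]))
    # Matricization: axes in `subset` become rows, all other axes become columns.
    matrix = np.transpose(tensor, subset + rest).reshape(rows, -1)
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    probs = singular_values**2 / np.sum(singular_values**2)
    probs = probs[probs > 1e-12]  # drop numerically zero entries before the log
    return float(-np.sum(probs * np.log(probs)))


# Toy 4-axis tensor (illustrative only): axes 0 and 1 are maximally entangled
# with each other, and so are axes 2 and 3.
bell = np.eye(2) / np.sqrt(2)
paired = np.einsum("ij,kl->ijkl", bell, bell)

# Partition that keeps the dependent axes together: entropy is approximately 0.
low = entanglement_entropy(paired, [0, 1])

# Partition that separates the dependent axes: entropy is 2 * ln(2), about 1.386.
high = entanglement_entropy(paired, [0, 2])
```

The same tensor has low entanglement under one partition and high entanglement under another, which gives some intuition for why rearranging features, as the preprocessing method in the abstract aims to do, could change how suitable a distribution is for a locally connected model.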