A caveat to many applications of the current Deep Learning approach is the need for large-scale data. One improvement suggested by Kolmogorov Complexity results is to apply the minimum description length (MDL) principle with computationally universal models. We study the potential gains in sample efficiency that this approach can bring in principle. We use polynomial-time Turing machines to represent computationally universal models, and Boolean circuits to represent Artificial Neural Networks (ANNs) acting on finite-precision digits. Our analysis reveals direct links between our question and Computational Complexity results. We provide lower and upper bounds on the potential gains in sample efficiency of applying the MDL principle with Turing machines instead of ANNs. Our bounds depend on the bit-size of the input of the Boolean function to be learned. Furthermore, we highlight close relationships between classical open problems in Circuit Complexity and the tightness of these bounds.
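For concreteness, a standard two-part formulation of the MDL principle the abstract refers to reads as follows (the notation here is a common convention, not necessarily the paper's own):
\[
h^{*} \;=\; \arg\min_{h \in \mathcal{H}} \Big( L(h) + L(D \mid h) \Big),
\]
where $\mathcal{H}$ is the model class (polynomial-time Turing machines or Boolean circuits in the two settings compared above), $L(h)$ is the description length of the hypothesis $h$ in bits, and $L(D \mid h)$ is the length of the data $D$ encoded with the help of $h$. The contrast studied here is how the choice of $\mathcal{H}$ affects the number of samples in $D$ needed to identify the target function.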