Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (10,250 models, 15 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice. We demonstrate that grouping tasks according to the Chomsky hierarchy allows us to forecast whether certain architectures will be able to generalize to out-of-distribution inputs. This includes negative results where even extensive amounts of data and training time never lead to any non-trivial generalization, despite models having sufficient capacity to fit the training data perfectly. Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, LSTMs can solve regular and counter-language tasks, and only networks augmented with structured memory (such as a stack or memory tape) can successfully generalize on context-free and context-sensitive tasks.
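As a rough illustration of the evaluation setup described above (not the study's actual code), the minimal Python sketch below groups a few example tasks by Chomsky level and measures out-of-distribution accuracy by testing on sequence lengths longer than any seen during training. The specific task functions (parity, string reversal, string duplication) and the evaluate helper are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: grouping tasks by Chomsky level and probing length generalization.
# All names here are hypothetical stand-ins, not the study's codebase.
import random

def parity(bits):            # regular: is the number of 1s odd?
    return sum(bits) % 2

def reverse_string(bits):    # deterministic context-free: reverse the input
    return bits[::-1]

def duplicate_string(bits):  # context-sensitive: output the input twice
    return bits + bits

TASKS = {
    "regular": parity,
    "context-free": reverse_string,
    "context-sensitive": duplicate_string,
}

def sample(length):
    """Draw a random binary sequence of the given length."""
    return [random.randint(0, 1) for _ in range(length)]

def evaluate(model_fn, task_fn, lengths, n=100):
    """Fraction of sequences (at unseen lengths) on which model_fn matches the ground truth."""
    correct = 0
    for _ in range(n):
        x = sample(random.choice(lengths))
        correct += int(model_fn(x) == task_fn(x))
    return correct / n

if __name__ == "__main__":
    # Train on lengths 1..40, test on lengths 41..100 to probe out-of-distribution
    # (length) generalization; here the ground-truth function stands in for a trained model.
    test_lengths = list(range(41, 101))
    for name, task in TASKS.items():
        acc = evaluate(task, task, test_lengths)
        print(f"{name}: out-of-distribution accuracy = {acc:.2f}")
```

In the study's setting, `model_fn` would be a trained network (RNN, LSTM, Transformer, or memory-augmented model), and the comparison across task groups is what reveals the hierarchy-dependent generalization failures summarized in the abstract.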