Many tasks can be described as compositions over subroutines. Though modern neural networks have achieved impressive performance on both vision and language tasks, we know little about the functions that they implement. One possibility is that neural networks implicitly break down complex tasks into subroutines, implement modular solutions to these subroutines, and compose them into an overall solution to a task -- a property we term structural compositionality. Or they may simply learn to match new inputs to memorized representations, eliding task decomposition entirely. Here, we leverage model pruning techniques to investigate this question in both vision and language, across a variety of architectures, tasks, and pretraining regimens. Our results demonstrate that models often implement solutions to subroutines via modular subnetworks, which can be ablated while maintaining the functionality of other subroutines. This suggests that neural networks may be able to learn to exhibit compositionality, obviating the need for specialized symbolic mechanisms.