Dynamic Inference with Neural Interpreters

Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call "functions". Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate the architecture in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and that they transfer to a new task in a sample-efficient manner. In the latter, we find that Neural Interpreters are competitive with the state of the art in terms of systematic generalization.
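
To make the idea of routing inputs through a set of modules more concrete, the following is a minimal sketch in plain Python. It is not the architecture from the paper: the abstract does not specify the routing rule, so the dot-product gating, the `signature` vectors, the tanh modules, and all names here (`make_function`, `routed_layer`, `forward`) are illustrative assumptions. It only shows soft routing of tokens across several modules (width), repeated application (depth), and adding a module afterwards without changing the interface. No training is performed.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_function(dim, rng):
    """Hypothetical module: a routing signature plus a small linear map."""
    signature = [rng.gauss(0, 1) for _ in range(dim)]
    weight = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
    return {"signature": signature, "weight": weight}


def apply_function(fn, token):
    """Apply the module's linear map followed by tanh."""
    return [math.tanh(dot(row, token)) for row in fn["weight"]]


def routed_layer(tokens, functions):
    """Softly route every token to every module (assumed gating rule)."""
    out = []
    for token in tokens:
        # Gate = softmax of token/signature similarity; an assumption.
        gates = softmax([dot(fn["signature"], token) for fn in functions])
        mixed = list(token)  # residual connection
        for gate, fn in zip(gates, functions):
            update = apply_function(fn, token)
            mixed = [m + gate * u for m, u in zip(mixed, update)]
        out.append(mixed)
    return out


def forward(tokens, functions, depth):
    """Reapply the same pool of modules `depth` times."""
    for _ in range(depth):
        tokens = routed_layer(tokens, functions)
    return tokens


rng = random.Random(0)
functions = [make_function(4, rng) for _ in range(3)]
tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(2)]
outputs = forward(tokens, functions, depth=2)

# Capacity extension in this toy setting: append a module and rerun.
functions.append(make_function(4, rng))
extended_outputs = forward(tokens, functions, depth=2)
```

In this toy setting, appending a module leaves the input and output shapes unchanged, which is the property that makes extension after training plausible; whether the real model extends capacity this way is not stated in the abstract.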
