That shared features between training and test data are required for generalisation in artificial neural networks has been a common assumption of both proponents and critics of these models. Here, we show that convolutional architectures avoid this limitation by applying them to two well-known challenges, based on learning the identity function and learning rules governing sequences of words. In each case, successful performance on the test set requires generalising to features that were not present in the training data, which is typically not feasible for standard connectionist models. However, our experiments demonstrate that neural networks can succeed on such problems when they incorporate the weight sharing employed by convolutional architectures. In the image-processing domain, such architectures are intended to reflect the translational symmetry of the natural world that such images depict. We discuss the role of symmetry in the two tasks and its connection to generalisation.
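To make the first challenge concrete, the sketch below (our illustration, not the authors' code; the 8-bit inputs, the kernel of size 1, and plain gradient descent are all simplifying assumptions) trains a minimal "convolution", a single shared weight w and bias b applied at every bit position, to copy its input. Every training input ends in 0, so a fully connected layer, with a separate weight per position, gets no signal from which to learn the final position's weight. The shared weight, by contrast, is learned from the positions that do vary and carries over unchanged to the held-out final bit at test time.

```python
# Minimal sketch of the identity-function challenge: copy n-bit inputs,
# trained only on "even" bit strings (rightmost bit always 0), tested
# on an "odd" one. Kernel size 1 means one weight and one bias shared
# across all positions, the simplest case of convolutional weight sharing.
import numpy as np

rng = np.random.default_rng(0)
n_bits, n_train, lr = 8, 512, 0.5

# Training data: random bit strings with the last bit clamped to 0.
X = rng.integers(0, 2, size=(n_train, n_bits)).astype(float)
X[:, -1] = 0.0

# Shared convolutional parameters (kernel size 1).
w, b = rng.normal(), rng.normal()

for _ in range(500):
    Y = w * X + b                  # same (w, b) applied at every position
    err = Y - X                    # identity target: output should equal input
    # Gradients are pooled across positions because the weights are shared.
    w -= lr * np.mean(err * X)
    b -= lr * np.mean(err)

# Test on an "odd" number: its last bit is 1, a feature never seen in training.
x_test = rng.integers(0, 2, size=n_bits).astype(float)
x_test[-1] = 1.0
print(np.round(w * x_test + b, 3))  # ~= x_test, including the final 1
```

Because the shared (w, b) converges to (1, 0) on the varying positions, the same mapping is applied to the clamped position, which is the generalisation to unseen features that a per-position weight matrix cannot achieve.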