Memory-augmented neural networks (MANNs) can solve algorithmic tasks such as sorting. However, they often fail to generalize to input-sequence lengths not seen during training. We therefore introduce two approaches that constrain the state space of the network controller to improve generalization to out-of-distribution-sized input sequences: state compression and state regularization. We show that both approaches improve the generalization capability of a particular type of MANN, the differentiable neural computer (DNC), and compare them against a stateful and a stateless controller on a set of algorithmic tasks. Furthermore, we show that especially the combination of both approaches enables a pre-trained DNC to be extended post hoc with a larger memory. Our approaches thus allow a DNC to be trained on shorter input sequences, saving computational resources. Moreover, we observe that the capability to generalize is often accompanied by loop structures in the state space, which could correspond to looping constructs in algorithms.