The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent in some error function, then remain fixed. The WM of a self-referential NN, however, can keep rapidly modifying all of itself during runtime. In principle, such NNs can meta-learn to learn, and meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. While NN architectures potentially capable of implementing such behaviour have been proposed since the '90s, there have been few if any practical studies. Here we revisit such NNs, building upon recent successes of fast weight programmers and closely related linear Transformers. We propose a scalable self-referential WM (SRWM) that learns to use outer products and the delta update rule to modify itself. We evaluate our SRWM in supervised few-shot learning and in multi-task reinforcement learning with procedurally generated game environments. Our experiments demonstrate both practical applicability and competitive performance of the proposed SRWM. Our code is public.
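The abstract mentions that the SRWM modifies itself via outer products and the delta update rule. As a rough illustration only (not the paper's implementation, and simplified to plain NumPy with illustrative names), the core delta-rule fast-weight update looks like this:

```python
import numpy as np

def delta_update(W, key, value, beta):
    """Delta-rule update of a weight matrix via an outer product.

    Replaces W's current response to `key` with `value`, at rate `beta`.
    W:     (d_out, d_in) weight matrix being modified at runtime
    key:   (d_in,)  retrieval pattern
    value: (d_out,) new target pattern
    beta:  scalar write rate, typically in [0, 1]
    """
    v_old = W @ key                              # what W currently maps key to
    return W + beta * np.outer(value - v_old, key)

# With beta = 1 and a unit-norm key, the write is exact: W_new @ k == v.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
k = np.array([1.0, 0.0, 0.0, 0.0])               # unit-norm key
v = np.array([1.0, 2.0, 3.0])
W_new = delta_update(W, k, v, beta=1.0)
print(np.allclose(W_new @ k, v))                 # True
```

In the actual SRWM, the keys, values, and write rates are themselves produced by the weight matrix from its own inputs, so the same mechanism lets the matrix rewrite itself; the sketch above shows only the update rule in isolation.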