The cornerstone of neural algorithmic reasoning is the ability to solve algorithmic tasks, especially in a way that generalises out of distribution. While recent years have seen a surge in methodological improvements in this area, they mostly focused on building specialist models. Specialist models are capable of learning to neurally execute either only one algorithm or a collection of algorithms with identical control-flow backbone. Here, instead, we focus on constructing a generalist neural algorithmic learner -- a single graph neural network processor capable of learning to execute a wide range of algorithms, such as sorting, searching, dynamic programming, path-finding and geometry. We leverage the CLRS benchmark to empirically show that, much like recent successes in the domain of perception, generalist algorithmic learners can be built by "incorporating" knowledge. That is, it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime. Motivated by this, we present a series of improvements to the input representation, training regime and processor architecture over CLRS, improving average single-task performance by over 20% from prior art. We then conduct a thorough ablation of multi-task learners leveraging these improvements. Our results demonstrate a generalist learner that effectively incorporates knowledge captured by specialist models.