Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks

Deep neural networks are powerful machines for visual pattern recognition, but reasoning tasks that are easy for humans may still be difficult for neural models. Humans possess the ability to extrapolate reasoning strategies learned on simple problems to solve harder examples, often by thinking for longer. For example, a person who has learned to solve small mazes can easily extend the very same search techniques to solve much larger mazes by spending more time. In computers, this behavior is often achieved through the use of algorithms, which scale to arbitrarily hard problem instances at the cost of more computation. In contrast, the sequential computing budget of feed-forward neural networks is limited by their depth, and networks trained on simple problems have no way of extending their reasoning to accommodate harder problems. In this work, we show that recurrent networks trained to solve simple problems with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference. We demonstrate this algorithmic behavior of recurrent networks on prefix sum computation, mazes, and chess. In all three domains, networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by "thinking for longer."
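To make the idea of "thinking for longer" concrete, the following is a minimal, hand-written sketch and not the trained network described above. It assumes the prefix sum task is taken modulo 2 over a bit string, and it uses a hypothetical fixed local update, `step`, as a stand-in for a weight-tied recurrent block. The names `step`, `run`, and `prefix_parity` are illustrative only. The point is that the same update, applied more times, handles longer inputs without any change to the update itself.

```python
from itertools import accumulate
from operator import xor


def step(x, y):
    """Apply one fixed local update (hypothetical stand-in for a recurrent block).

    Each position combines its own input bit with the current estimate
    held by its left neighbour. The rule is identical on every iteration.
    """
    if not x:
        return []
    return [x[0]] + [x[i] ^ y[i - 1] for i in range(1, len(x))]


def run(x, iterations):
    """Start from the raw input and apply the same update `iterations` times."""
    y = list(x)
    for _ in range(iterations):
        y = step(x, y)
    return y


def prefix_parity(x):
    """Reference answer: running XOR (prefix sum modulo 2) of the bit string."""
    return list(accumulate(x, xor))


short = [1, 0, 1, 1]
long_ = [1, 0, 1, 1, 0, 1, 1, 1]

# Three iterations are enough for the 4-bit input ...
print(run(short, 3) == prefix_parity(short))   # True
# ... but not for the 8-bit input,
print(run(long_, 3) == prefix_parity(long_))   # False
# which is solved by simply iterating the same update more times.
print(run(long_, 7) == prefix_parity(long_))   # True
```

In this sketch, position `i` is guaranteed correct after `i` iterations, so an input of length `n` needs at most `n - 1` iterations. A feed-forward model with a fixed number of such layers would be limited to inputs of bounded length, which is the contrast the abstract draws.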
