Neural networks have succeeded in many reasoning tasks. Empirically, these tasks require specialized network structures: for example, Graph Neural Networks (GNNs) perform well on many such tasks, whereas less structured networks fail. Theoretically, there is limited understanding of why and when one network structure generalizes better than others, even though they have equal expressive power. In this paper, we develop a framework to characterize which reasoning tasks a network can learn well, by studying how well its computation structure aligns with the algorithmic structure of the relevant reasoning process. We formally define this algorithmic alignment and derive a sample complexity bound that decreases with better alignment. This framework offers an explanation for the empirical success of popular reasoning models and suggests their limitations. As an example, we unify seemingly different reasoning tasks, such as intuitive physics, visual question answering, and shortest paths, through the lens of a powerful algorithmic paradigm, dynamic programming (DP). We show that GNNs align with DP and thus are expected to solve these tasks. On several reasoning tasks, our theory is supported by empirical results.
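To make the claimed correspondence between GNNs and DP concrete, the following is a minimal sketch for the shortest-paths task, not the paper's implementation. It places the Bellman-Ford DP update next to a generic message-passing loop and shows that the two coincide when the message, aggregate, and update functions are chosen by hand. All names here (`bellman_ford`, `message_passing`, `EDGES`) are hypothetical and introduced only for illustration. In a trained GNN those three functions would be learned modules rather than fixed arithmetic.

```python
import math

# Hypothetical toy graph: directed edges as (source_node, target_node, weight).
EDGES = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0), (2, 3, 5.0)]


def bellman_ford(num_nodes, edges, source):
    """Shortest paths as DP: d_k[u] = min(d_{k-1}[u], min_v d_{k-1}[v] + w(v, u))."""
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes - 1):
        new_dist = list(dist)  # synchronous update: read old table, write new one
        for v, u, w in edges:
            new_dist[u] = min(new_dist[u], dist[v] + w)
        dist = new_dist
    return dist


def message_passing(num_nodes, edges, source, num_layers, message, aggregate, update):
    """GNN-shaped loop: h_k[u] = update(h_{k-1}[u], aggregate(messages from in-neighbours))."""
    h = [math.inf] * num_nodes
    h[source] = 0.0
    for _ in range(num_layers):
        inbox = [[] for _ in range(num_nodes)]
        for v, u, w in edges:
            inbox[u].append(message(h[v], w))
        # Nodes with no incoming messages keep their previous state.
        h = [update(h[u], aggregate(inbox[u])) if inbox[u] else h[u]
             for u in range(num_nodes)]
    return h


dp_result = bellman_ford(4, EDGES, source=0)

# Hand-picked functions that turn the generic loop into Bellman-Ford.
gnn_result = message_passing(
    4, EDGES, source=0, num_layers=3,
    message=lambda h_v, w: h_v + w,  # candidate path length through neighbour v
    aggregate=min,                   # best candidate among all neighbours
    update=min,                      # keep the previous estimate if it is better
)

print(dp_result)
print(gnn_result)
```

Both calls produce the same distance table, because each layer of the message-passing loop performs exactly one DP relaxation step. Under this reading, each learned module only has to approximate a simple function (an addition or a minimum) rather than the whole algorithm, which is the intuition behind relating alignment to sample complexity.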