This paper presents ReasonFormer, a unified reasoning framework that mirrors the modular and compositional reasoning process humans use in complex decision making. Inspired by dual-process theory in cognitive science, the framework disentangles the representation module (automatic thinking) from the reasoning modules (controlled thinking) to capture different levels of cognition. On top of the representation module, the pre-trained reasoning modules are modular and each specializes in a specific, fundamental reasoning skill (e.g., logic, simple QA, etc.). To mimic the controlled, compositional thinking process, the reasoning modules are dynamically activated and composed in both parallel and cascaded manners, which controls which reasoning skills are activated and how deep the reasoning process goes for the problem at hand. The unified framework solves multiple tasks with a single model, and both training and inference are end-to-end. Evaluated on 11 datasets that require different reasoning skills and levels of complexity, ReasonFormer demonstrates substantial performance boosts, indicating compositional reasoning ability. Few-shot experiments show better generalization, which comes from learning to compose pre-trained skills for new tasks with limited data and from decoupling the representation module from the reasoning modules. Further analysis shows that the reasoning modules are modular: different tasks activate distinct reasoning skills at different reasoning depths.
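The parallel and cascaded composition described above can be illustrated with a small sketch. It is not the authors' implementation: the names `route`, `reasoning_step`, and `compose`, the dot-product router, the top-k selection, and the fixed `depth` are all assumptions made for illustration, whereas the abstract states only that skills are activated dynamically and that reasoning depth is controlled.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Skill = Callable[[Vector], Vector]  # a reasoning module: hidden state -> hidden state


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden: Vector, router_weights: Sequence[Vector]) -> List[float]:
    """Hypothetical router: one dot-product score per skill, then softmax."""
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in router_weights]
    return softmax(scores)


def reasoning_step(
    hidden: Vector,
    skills: Sequence[Skill],
    router_weights: Sequence[Vector],
    top_k: int,
) -> Tuple[Vector, List[int]]:
    """Parallel composition: run the top-k skills and mix their outputs."""
    weights = route(hidden, router_weights)
    active = sorted(range(len(skills)), key=lambda i: weights[i], reverse=True)[:top_k]
    norm = sum(weights[i] for i in active)  # renormalize over active skills only
    mixed = [0.0] * len(hidden)
    for i in active:
        skill_out = skills[i](hidden)
        for d, value in enumerate(skill_out):
            mixed[d] += (weights[i] / norm) * value
    return mixed, active


def compose(
    hidden: Vector,
    skills: Sequence[Skill],
    router_weights: Sequence[Vector],
    depth: int,
    top_k: int = 2,
) -> Tuple[Vector, List[List[int]]]:
    """Cascaded composition: feed each step's output into the next step.

    A fixed `depth` is a simplification; the abstract describes the depth
    as something the model controls.
    """
    trace: List[List[int]] = []
    for _ in range(depth):
        hidden, active = reasoning_step(hidden, skills, router_weights, top_k)
        trace.append(active)
    return hidden, trace
```

The returned `trace` records which skill indices were active at each depth, which is the kind of information the closing analysis in the abstract refers to when it says different tasks activate distinct skills at different depths.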