与组成理由解释理由变换器的语文模式脱钩 (Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers) - 专知论文

会员服务 ·

0

Processing（编程语言） · 控制器 · 语言模型化 · Cognition · MoDELS ·

2022 年 12 月 7 日

Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers

翻译：与组成理由解释理由变换器的语文模式脱钩

Wanjun Zhong,Tingting Ma,Jiahai Wang,Jian Yin,Tiejun Zhao,Chin-Yew Lin,Nan Duan

from arxiv, 12 pages

This paper presents ReasonFormer, a unified reasoning framework for mirroring the modular and compositional reasoning process of humans in complex decision-making. Inspired by dual-process theory in cognitive science, the representation module (automatic thinking) and reasoning modules (controlled thinking) are decoupled to capture different levels of cognition. Upon the top of the representation module, the pre-trained reasoning modules are modular and professional in specific and fundamental reasoning skills (e.g., logic, simple QA, etc). To mimic the controlled compositional thinking process, different reasoning modules are dynamically activated and composed in both parallel and cascaded manners to control what reasoning skills are activated and how deep the reasoning process will be reached to solve the current problems. The unified reasoning framework solves multiple tasks with a single model, and is trained and inferred in an end-to-end manner. Evaluated on 11 datasets requiring different reasoning skills and complexity, ReasonFormer demonstrates substantial performance boosts, revealing the compositional reasoning ability. Few-shot experiments exhibit better generalization ability by learning to compose pre-trained skills for new tasks with limited data, and decoupling the representation module and the reasoning modules. Further analysis shows the modularity of reasoning modules as different tasks activate distinct reasoning skills at different reasoning depths.

翻译：本文介绍了在复杂的决策中反映人类模块化和构成推理过程的统一推理框架 " 理由Former " 。在认知科学的双重过程理论的启发下,代表模块(自动思维)和推理模块(控制思维)被拆开,以捕捉不同程度的认知。在代表模块的顶端,预先培训的推理模块在具体和基本推理技能(如逻辑、简单质量A等)方面是模块化和专业化的。模拟受控的构思思维过程,不同推理模块被动态激活,以平行和分层的方式组成,以控制哪些推理技能被激活,以及如何深入推理进程解决当前问题。统一推理框架用单一模型解决多种任务,并以端对端方式进行培训和推断。对需要不同推理技能和复杂性的11个数据集进行了评估(如逻辑、简单QA等),EricFormer展示了实质性的推理推理能力。很少的实验通过学习如何以有限的数据和分层推理方式配置新任务的预演技能,展示了更好的一般化能力。不同的推理,在不同的推理模型中进一步展示了不同的推理。

0

相关内容

Processing（编程语言）

Processing（编程语言）

Processing 是一门开源编程语言和与之配套的集成开发环境（IDE）的名称。Processing 在电子艺术和视觉设计社区被用来教授编程基础，并运用于大量的新媒体和互动艺术作品中。

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

集值优化问题的逼近解及二阶最优性条件

国家自然科学基金

0+阅读 · 2014年12月31日

流感病毒NS基因功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

广义Markov跳变系统的非同步控制

国家自然科学基金

0+阅读 · 2013年12月31日

基于资源变迁回路的柔性制造系统死锁控制方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于连锁和关联分析定位玉米瘤黑粉病抗性基因

国家自然科学基金

0+阅读 · 2013年12月31日

冷原子系统中的可积模型

国家自然科学基金

0+阅读 · 2013年12月31日

CAP70经调节PTEN磷酸化影响肾癌恶性表型的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于侧向非平稳约束动力学系统的车辆附着极限态稳定性控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

ACAP4在胃酸分泌中的功能及调节分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Improving Certified Robustness via Statistical Learning with Logical Reasoning

Improving Certified Robustness via Statistical Learning with Logical Reasoning

Arxiv

0+阅读 · 2023年2月9日

Lightweight Transformers for Clinical Natural Language Processing

Arxiv

0+阅读 · 2023年2月9日

Revisiting Pre-training in Audio-Visual Learning

Arxiv

0+阅读 · 2023年2月7日

Protecting Language Generation Models via Invisible Watermarking

Arxiv

0+阅读 · 2023年2月6日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

23+阅读 · 2021年8月12日

Neural Collaborative Reasoning

Arxiv

13+阅读 · 2021年5月3日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Compositional GAN: Learning Conditional Image Composition

Compositional GAN: Learning Conditional Image Composition

Arxiv

31+阅读 · 2018年7月19日

Variational Knowledge Graph Reasoning

Arxiv

15+阅读 · 2018年4月5日

VIP会员

文章信息

相关主题

Processing（编程语言）

语言模型化

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《现代化战役与作战规划：陆军的未来之路》最新101页

《理解Link 16：军事通信的支柱——探索战术数据交换网络》

《人工智能在军事行动作战规划过程中的应用可能性》

《洞穴环境无线电传播建模》147页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Improving Certified Robustness via Statistical Learning with Logical Reasoning

Improving Certified Robustness via Statistical Learning with Logical Reasoning

Arxiv

0+阅读 · 2023年2月9日

Lightweight Transformers for Clinical Natural Language Processing

Arxiv

0+阅读 · 2023年2月9日

Revisiting Pre-training in Audio-Visual Learning

Arxiv

0+阅读 · 2023年2月7日

Protecting Language Generation Models via Invisible Watermarking

Arxiv

0+阅读 · 2023年2月6日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

23+阅读 · 2021年8月12日

Neural Collaborative Reasoning

Arxiv

13+阅读 · 2021年5月3日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Compositional GAN: Learning Conditional Image Composition

Compositional GAN: Learning Conditional Image Composition

Arxiv

31+阅读 · 2018年7月19日

Variational Knowledge Graph Reasoning

Arxiv

15+阅读 · 2018年4月5日

相关基金

集值优化问题的逼近解及二阶最优性条件

国家自然科学基金

0+阅读 · 2014年12月31日

流感病毒NS基因功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

广义Markov跳变系统的非同步控制

国家自然科学基金

0+阅读 · 2013年12月31日

基于资源变迁回路的柔性制造系统死锁控制方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于连锁和关联分析定位玉米瘤黑粉病抗性基因

国家自然科学基金

0+阅读 · 2013年12月31日

冷原子系统中的可积模型

国家自然科学基金

0+阅读 · 2013年12月31日

CAP70经调节PTEN磷酸化影响肾癌恶性表型的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于侧向非平稳约束动力学系统的车辆附着极限态稳定性控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

ACAP4在胃酸分泌中的功能及调节分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员