Machine learning systems perform well on pattern-matching tasks, but their ability to perform algorithmic or logical reasoning is not well understood. One important reasoning capability is logical extrapolation, in which models trained only on small/simple reasoning problems can synthesize complex algorithms that scale up to large/complex problems at test time. Logical extrapolation can be achieved through recurrent systems, which can be iterated many times to solve difficult reasoning problems. We observe that this approach fails to scale to highly complex problems because behavior degenerates when many iterations are applied -- an issue we refer to as "overthinking." We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten. We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely. Together, these innovations prevent the overthinking problem and enable recurrent systems to solve extremely hard logical extrapolation tasks, some requiring over 100K convolutional layers.
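To make the recall idea concrete, the following is a minimal PyTorch-style sketch of a recurrent block that re-injects the original problem instance at every iteration so it cannot be forgotten. It is an illustration under assumed names and layer sizes (RecallRecurrentBlock, the channel widths, and the 3x3 convolutions are all hypothetical), not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class RecallRecurrentBlock(nn.Module):
    """Sketch of a recurrent block with recall: the raw input x is
    concatenated to the hidden features at every iteration."""

    def __init__(self, in_channels: int, hidden_channels: int):
        super().__init__()
        # Embed the original input into the feature space once.
        self.embed = nn.Conv2d(in_channels, hidden_channels, 3, padding=1)
        # The recurrent body always sees [features ; original input].
        self.body = nn.Sequential(
            nn.Conv2d(hidden_channels + in_channels, hidden_channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_channels, hidden_channels, 3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor, iterations: int) -> torch.Tensor:
        h = self.embed(x)
        for _ in range(iterations):
            # Recall: re-inject the problem instance at each iteration, so the
            # same weights can be applied far more times at test time than
            # during training without losing track of the input.
            h = self.body(torch.cat([h, x], dim=1))
        return h
```

Because the body is iteration-agnostic (the same weights and the same recalled input at every step), the number of iterations can in principle be increased at test time to extrapolate to harder problem instances than those seen during training.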