We propose a novel approach to interactive theorem proving (ITP) using deep reinforcement learning. The proposed framework learns proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a novel backtracking mechanism that enables the agent to efficiently discard (predicted) dead-end derivations and restart the search from promising alternatives. We implement the framework in the HOL4 theorem prover. Experimental results show that the framework outperforms the existing automated theorem provers (i.e., hammers) available in HOL4 when evaluated on unseen problems. We further investigate the role of key components of the framework through ablation studies.
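To make the MDP formulation concrete, the following is a minimal sketch (not the authors' implementation) of a state as a set of candidate derivation paths, where backtracking amounts to abandoning the current path and resuming from a previously recorded one. All names here (`Fringe`, `ProverState`, `step`) are hypothetical illustrations, not HOL4 or paper APIs.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Fringe:
    """One candidate derivation path: the goals still to be proved."""
    goals: tuple[str, ...]

    def is_proved(self) -> bool:
        # A path with no remaining goals is a completed proof.
        return len(self.goals) == 0

@dataclass
class ProverState:
    """MDP state: the set of derivation paths explored so far.

    Keeping every fringe (rather than only the current one) is what
    allows backtracking: the agent may expand any recorded fringe,
    not just the most recent, so dead ends can simply be abandoned.
    """
    fringes: list[Fringe] = field(default_factory=list)

    def step(self, fringe_idx: int, new_goals: tuple[str, ...]) -> None:
        """Record the result of applying some tactic to a chosen fringe.

        The environment (the prover) produces `new_goals`; the chosen
        index may point at an old fringe, which realizes backtracking.
        """
        assert 0 <= fringe_idx < len(self.fringes)
        self.fringes.append(Fringe(new_goals))

# Hypothetical usage: the agent repeatedly picks a fringe, a tactic,
# and arguments; the resulting subgoals become a new fringe.
state = ProverState([Fringe(("|- p /\\ q ==> q /\\ p",))])
state.step(0, ("|- q /\\ p",))  # one tactic application (illustrative)
state.step(0, ())               # backtrack to fringe 0; this attempt closes the goal
print(any(f.is_proved() for f in state.fringes))  # True: a proof was found
```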