持续控制问题学习解释、高执行率政策 (Learning Interpretable, High-Performing Policies for Continuous Control Problems) - 专知论文

会员服务 ·

0

Continuity · 学成 · 控制器 · 优化器 · Performer ·

2022 年 2 月 4 日

Learning Interpretable, High-Performing Policies for Continuous Control Problems

翻译：持续控制问题学习解释、高执行率政策

Rohan Paleja,Yaru Niu,Andrew Silva,Chace Ritchie,Sugju Choi,Matthew Gombolay

Gradient-based approaches in reinforcement learning (RL) have achieved tremendous success in learning policies for continuous control problems. While the performance of these approaches warrants real-world adoption in domains, such as in autonomous driving and robotics, these policies lack interpretability, limiting deployability in safety-critical and legally-regulated domains. Such domains require interpretable and verifiable control policies that maintain high performance. We propose Interpretable Continuous Control Trees (ICCTs), a tree-based model that can be optimized via modern, gradient-based, RL approaches to produce high-performing, interpretable policies. The key to our approach is a procedure for allowing direct optimization in a sparse decision-tree-like representation. We validate ICCTs against baselines across six domains, showing that ICCTs are capable of learning interpretable policy representations that parity or outperform baselines by up to 33$\%$ in autonomous driving scenarios while achieving a $300$x-$600$x reduction in the number of policy parameters against deep learning baselines.

翻译：强化学习(RL)的渐进式方法在学习持续控制问题的政策方面取得了巨大成功。虽然这些方法的绩效需要在诸如自主驾驶和机器人等领域实际采用,但这些政策缺乏可解释性,限制了安全关键和受法律管制领域的可部署性。这些领域需要可解释和可核查的控制政策,以保持高性能。我们提议了可解释和可核查的连续控制树(ICCTs),这是一种以树为基础的模式,可以通过现代、梯度和RL方式优化,以产生高绩效和可解释的政策。我们方法的关键是允许在稀有的决策型代表处直接优化程序。我们对照六个领域的基线验证了国际电算技术中心。我们证明国际电算技术中心能够学习可解释的政策表述,即在自主驾驶情景中,平等或优于基线,最高为33美元,同时根据深层次学习基线,使政策参数减少300美元-600美元。

0

相关内容

Continuity

让 iOS 8 和 OS X Yosemite 无缝切换的一个新特性。 > Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source: Apple - iOS 8

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【CMU】最新深度学习课程， Introduction to Deep Learning

【CMU】最新深度学习课程， Introduction to Deep Learning

专知会员服务

38+阅读 · 2020年9月12日

【DLBM-SS暑期课程】深度学习与贝叶斯方法 Deep Learning and Bayesian Methods

【DLBM-SS暑期课程】深度学习与贝叶斯方法 Deep Learning and Bayesian Methods

专知会员服务

67+阅读 · 2019年11月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

MEMS声衬及分布式阻抗控制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于细胞凋亡的MR/PET分子影像对非小细胞肺癌疗效评估的研究

国家自然科学基金

0+阅读 · 2013年12月31日

自适应移动Kriging插值响应面可靠性分析方法及其应用研究

国家自然科学基金

1+阅读 · 2013年12月31日

Pictet–Spengler类反应机理的理论研究和新反应设计

国家自然科学基金

0+阅读 · 2013年12月31日

高维统计模型中的稳健推断及其应用

国家自然科学基金

1+阅读 · 2012年12月31日

实值多变量维数约简研究及应用

国家自然科学基金

0+阅读 · 2012年12月31日

基于事件的强化学习及其在群机器人优化控制中的应用

国家自然科学基金

3+阅读 · 2012年12月31日

云南双江—沧源新近纪植物多样性与古环境

国家自然科学基金

0+阅读 · 2011年12月31日

高速移动条件下无线宽带网络资源管理的研究

国家自然科学基金

0+阅读 · 2011年12月31日

NOX在UVA诱导的肥大细胞胞浆钙振荡中的作用

国家自然科学基金

0+阅读 · 2009年12月31日

Memory-Constrained Policy Optimization

Arxiv

0+阅读 · 2022年4月20日

Energy-Based Continuous Inverse Optimal Control

Arxiv

0+阅读 · 2022年4月19日

Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models

Arxiv

1+阅读 · 2022年4月18日

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Arxiv

0+阅读 · 2022年4月18日

Learning-Based Approaches for Graph Problems: A Survey

Arxiv

1+阅读 · 2022年4月17日

Space-sequential particle filters for high-dimensional dynamical systems described by stochastic differential equations

Arxiv

0+阅读 · 2022年4月15日

Selecting Continuous Life-Like Cellular Automata for Halting Unpredictability: Evolving for Abiogenesis

Selecting Continuous Life-Like Cellular Automata for Halting Unpredictability: Evolving for Abiogenesis

Arxiv

0+阅读 · 2022年4月15日

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Arxiv

0+阅读 · 2022年4月15日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

VIP会员

文章信息

相关主题

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【CMU】最新深度学习课程， Introduction to Deep Learning

【CMU】最新深度学习课程， Introduction to Deep Learning

专知会员服务

38+阅读 · 2020年9月12日

【DLBM-SS暑期课程】深度学习与贝叶斯方法 Deep Learning and Bayesian Methods

【DLBM-SS暑期课程】深度学习与贝叶斯方法 Deep Learning and Bayesian Methods

专知会员服务

67+阅读 · 2019年11月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Memory-Constrained Policy Optimization

Arxiv

0+阅读 · 2022年4月20日

Energy-Based Continuous Inverse Optimal Control

Arxiv

0+阅读 · 2022年4月19日

Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models

Arxiv

1+阅读 · 2022年4月18日

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Arxiv

0+阅读 · 2022年4月18日

Learning-Based Approaches for Graph Problems: A Survey

Arxiv

1+阅读 · 2022年4月17日

Space-sequential particle filters for high-dimensional dynamical systems described by stochastic differential equations

Arxiv

0+阅读 · 2022年4月15日

Selecting Continuous Life-Like Cellular Automata for Halting Unpredictability: Evolving for Abiogenesis

Selecting Continuous Life-Like Cellular Automata for Halting Unpredictability: Evolving for Abiogenesis

Arxiv

0+阅读 · 2022年4月15日

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Arxiv

0+阅读 · 2022年4月15日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

相关基金

MEMS声衬及分布式阻抗控制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于细胞凋亡的MR/PET分子影像对非小细胞肺癌疗效评估的研究

国家自然科学基金

0+阅读 · 2013年12月31日

自适应移动Kriging插值响应面可靠性分析方法及其应用研究

国家自然科学基金

1+阅读 · 2013年12月31日

Pictet–Spengler类反应机理的理论研究和新反应设计

国家自然科学基金

0+阅读 · 2013年12月31日

高维统计模型中的稳健推断及其应用

国家自然科学基金

1+阅读 · 2012年12月31日

实值多变量维数约简研究及应用

国家自然科学基金

0+阅读 · 2012年12月31日

基于事件的强化学习及其在群机器人优化控制中的应用

国家自然科学基金

3+阅读 · 2012年12月31日

云南双江—沧源新近纪植物多样性与古环境

国家自然科学基金

0+阅读 · 2011年12月31日

高速移动条件下无线宽带网络资源管理的研究

国家自然科学基金

0+阅读 · 2011年12月31日

NOX在UVA诱导的肥大细胞胞浆钙振荡中的作用

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员