政策优化与线性时时逻辑限制 (Policy Optimization with Linear Temporal Logic Constraints) - 专知论文

会员服务 ·

0

优化器 · 线性的 · Analysis · 代价 · 样本复杂度 ·

2022 年 6 月 20 日

Policy Optimization with Linear Temporal Logic Constraints

翻译：政策优化与线性时时逻辑限制

Cameron Voloshin,Hoang M. Le,Swarat Chaudhuri,Yisong Yue

from arxiv, Under Review at Neurips

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework, decoupling task specification from policy selection, and an alternative to the standard of cost shaping. With access to a generative model, we develop a model-based approach that enjoys a sample complexity analysis for guaranteeing both task satisfaction and cost optimality (through a reduction to a reachability problem). Empirically, our algorithm can achieve strong performance even in low sample regimes.

翻译：我们用线性时间逻辑(LTL)限制来研究政策优化问题,LTL的语言允许灵活地描述可能非自然的任务,将任务编码为计算成本功能。我们认为受LTL限制的PO是一个系统框架,将任务规格与政策选择脱钩,并取代成本形成标准。有了基因化模型,我们开发了一个基于模型的方法,通过抽样复杂分析(通过降低到可达性问题 ) 来保证任务满意度和成本最佳性。我们的算法即使在低抽样制度下也能取得强劲的绩效。

0

相关内容

优化器

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

专知会员服务

18+阅读 · 2019年11月1日

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

专知会员服务

13+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

复多项式的核拓扑熵

国家自然科学基金

0+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

多项式相位信号参数估计的迭代算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于低秩约束矩阵恢复的高维地震数据重建

国家自然科学基金

0+阅读 · 2013年12月31日

随机空间结构动力失稳性能及主动模糊控制研究

国家自然科学基金

0+阅读 · 2012年12月31日

随机时滞动态网络的分岔控制与优化

国家自然科学基金

0+阅读 · 2012年12月31日

半参数回归分析的随机函数法及其高维情形

国家自然科学基金

2+阅读 · 2012年12月31日

Eulerian bond-cubic 模型渗流性质的数值研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于小波有限元的探地雷达正演模拟及偏移处理

国家自然科学基金

0+阅读 · 2008年12月31日

Risk and optimal policies in bandit experiments

Arxiv

0+阅读 · 2022年8月9日

Consistent Approximations in Composite Optimization

Arxiv

0+阅读 · 2022年8月8日

A Dual Accelerated Method for Online Stochastic Distributed Averaging: From Consensus to Decentralized Policy Evaluation

Arxiv

0+阅读 · 2022年8月8日

Sample and Computationally Efficient Stochastic Kriging in High Dimensions

Arxiv

0+阅读 · 2022年8月8日

Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions

Arxiv

0+阅读 · 2022年8月8日

Homological- and analytical-preserving serendipity framework for polytopal complexes, with application to the DDR method

Arxiv

0+阅读 · 2022年8月7日

Collision Avoidance for Dynamic Obstacles with Uncertain Predictions using Model Predictive Control

Arxiv

0+阅读 · 2022年8月6日

The Extended UCB Policies for Frequentist Multi-armed Bandit Problems

Arxiv

0+阅读 · 2022年8月6日

Distinction Maximization Loss: Efficiently Improving Out-of-Distribution Detection and Uncertainty Estimation by Replacing the Loss and Calibrating

Arxiv

0+阅读 · 2022年8月5日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

17+阅读 · 2019年10月30日

VIP会员

文章信息

相关主题

样本复杂度

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

专知会员服务

18+阅读 · 2019年11月1日

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

专知会员服务

13+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Risk and optimal policies in bandit experiments

Arxiv

0+阅读 · 2022年8月9日

Consistent Approximations in Composite Optimization

Arxiv

0+阅读 · 2022年8月8日

A Dual Accelerated Method for Online Stochastic Distributed Averaging: From Consensus to Decentralized Policy Evaluation

Arxiv

0+阅读 · 2022年8月8日

Sample and Computationally Efficient Stochastic Kriging in High Dimensions

Arxiv

0+阅读 · 2022年8月8日

Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions

Arxiv

0+阅读 · 2022年8月8日

Homological- and analytical-preserving serendipity framework for polytopal complexes, with application to the DDR method

Arxiv

0+阅读 · 2022年8月7日

Collision Avoidance for Dynamic Obstacles with Uncertain Predictions using Model Predictive Control

Arxiv

0+阅读 · 2022年8月6日

The Extended UCB Policies for Frequentist Multi-armed Bandit Problems

Arxiv

0+阅读 · 2022年8月6日

Distinction Maximization Loss: Efficiently Improving Out-of-Distribution Detection and Uncertainty Estimation by Replacing the Loss and Calibrating

Arxiv

0+阅读 · 2022年8月5日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

17+阅读 · 2019年10月30日

相关基金

复多项式的核拓扑熵

国家自然科学基金

0+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

多项式相位信号参数估计的迭代算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于低秩约束矩阵恢复的高维地震数据重建

国家自然科学基金

0+阅读 · 2013年12月31日

随机空间结构动力失稳性能及主动模糊控制研究

国家自然科学基金

0+阅读 · 2012年12月31日

随机时滞动态网络的分岔控制与优化

国家自然科学基金

0+阅读 · 2012年12月31日

半参数回归分析的随机函数法及其高维情形

国家自然科学基金

2+阅读 · 2012年12月31日

Eulerian bond-cubic 模型渗流性质的数值研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于小波有限元的探地雷达正演模拟及偏移处理

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员