强化学习期间保证满足时间逻辑限制的可能性 (Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning) - 专知论文

会员服务 ·

0

约束 · Performer · 学成 · 强化学习 · Processing（编程语言） ·

2021 年 2 月 19 日

Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning

翻译：强化学习期间保证满足时间逻辑限制的可能性

Derya Aksaray,Yasin Yazicioglu,Ahmet Semi Asarkaya

We present a novel reinforcement learning algorithm for finding optimal policies in Markov Decision Processes while satisfying temporal logic constraints with a desired probability throughout the learning process. An automata-theoretic approach is proposed to ensure probabilistic satisfaction of the constraint in each episode, which is different from penalizing violations to achieve constraint satisfaction after a sufficiently large number of episodes. The proposed approach is based on computing a lower bound on the probability of constraint satisfaction and adjusting the exploration behavior as needed. We present theoretical results on the probabilistic constraint satisfaction achieved by the proposed approach. We also numerically demonstrate the proposed idea in a drone scenario, where the constraint is to perform periodically arriving pick-up and delivery tasks and the objective is to fly over high-reward zones to simultaneously perform aerial monitoring.

翻译：我们提出了一个新的强化学习算法,用于在Markov决定程序中寻找最佳政策,同时满足时间逻辑限制,在整个学习过程中达到理想的概率。我们提议采用自成一体的理论方法,以确保每个事件都能够顺利地满足限制,这不同于惩罚违规行为,以便在足够多的事件发生后达到约束性满足。我们提议的方法基于较低的约束性满意度约束度,并根据需要调整勘探行为。我们提出了关于拟议方法实现的概率性约束性满意度的理论结果。我们还从数字上展示了无人机情景中的拟议想法,其中的限制是定期完成接收和交付任务,目标是飞越高纬度区域,同时进行空中监测。

0

相关内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

已删除

将门创投

4+阅读 · 2019年6月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Learning Context-Adaptive Task Constraints for Robotic Manipulation

Arxiv

0+阅读 · 2021年4月13日

The Programming of Deep Learning Accelerators as a Constraint Satisfaction Problem

The Programming of Deep Learning Accelerators as a Constraint Satisfaction Problem

Arxiv

0+阅读 · 2021年4月13日

Particle MPC for Uncertain and Learning-Based Control

Arxiv

0+阅读 · 2021年4月13日

Revisiting the Approximate Carathéodory Problem via the Frank-Wolfe Algorithm

Arxiv

0+阅读 · 2021年4月12日

Convergence of Adaptive, Randomized, Iterative Linear Solvers

Arxiv

0+阅读 · 2021年4月10日

Sampling Constraint Satisfaction Solutions in the Local Lemma Regime

Arxiv

0+阅读 · 2021年4月10日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Automatic Face Aging in Videos via Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年11月27日

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Arxiv

3+阅读 · 2018年9月4日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《人与智能体在系统工程建模语言V2任务中的性能表现：基于用户中心化的评估方法》308页

《数据安全国家标准体系（2025版）》征求意见稿

AlphaMosaic：人工智能赋能的作战管理系统

《军事行动中通信平台的战略价值：提升战术效能与作战优势》

相关资讯

已删除

将门创投

4+阅读 · 2019年6月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

相关论文

Learning Context-Adaptive Task Constraints for Robotic Manipulation

Arxiv

0+阅读 · 2021年4月13日

The Programming of Deep Learning Accelerators as a Constraint Satisfaction Problem

The Programming of Deep Learning Accelerators as a Constraint Satisfaction Problem

Arxiv

0+阅读 · 2021年4月13日

Particle MPC for Uncertain and Learning-Based Control

Arxiv

0+阅读 · 2021年4月13日

Revisiting the Approximate Carathéodory Problem via the Frank-Wolfe Algorithm

Arxiv

0+阅读 · 2021年4月12日

Convergence of Adaptive, Randomized, Iterative Linear Solvers

Arxiv

0+阅读 · 2021年4月10日

Sampling Constraint Satisfaction Solutions in the Local Lemma Regime

Arxiv

0+阅读 · 2021年4月10日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Automatic Face Aging in Videos via Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年11月27日

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Arxiv

3+阅读 · 2018年9月4日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

微信扫码咨询专知VIP会员