Decision-making policies for agents are often synthesized with the constraint that a formal specification of behaviour is satisfied. Here we focus on infinite-horizon properties. On the one hand, Linear Temporal Logic (LTL) is a popular formalism for qualitative specifications. On the other hand, Steady-State Policy Synthesis (SSPS) has recently received considerable attention as it provides a more quantitative and more behavioural perspective on specifications, in terms of the frequency with which states are visited. Finally, rewards provide a classic framework for quantitative properties. In this paper, we study Markov decision processes (MDP) with specifications combining all three types. The derived policy maximizes the reward among all policies that satisfy the LTL specification with the given probability and adhere to the steady-state constraints. To this end, we provide a unified solution reducing the multi-type specification to a multi-dimensional long-run average reward. This is enabled by Limit-Deterministic B\"uchi Automata (LDBA), recently studied in the context of LTL model checking on MDP, and allows for an elegant solution through a simple linear programme. The algorithm also extends to general $\omega$-regular properties and runs in time polynomial in the sizes of the MDP as well as the LDBA.
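To give a flavour of the kind of linear programme involved, the classical steady-state formulation for a single recurrent class of an MDP uses variables $x_{s,a}$ for the long-run frequency with which action $a$ is taken in state $s$. The sketch below is a simplified illustration only: it assumes a unichain MDP with lower and upper steady-state bounds $\ell_s, u_s$, and it does not reproduce the paper's full construction on the product with the LDBA (the variable names and the single-chain assumption are ours).
\begin{align*}
\max_{x \ge 0} \quad & \sum_{s \in S} \sum_{a \in A(s)} x_{s,a}\, r(s,a)
  && \text{(long-run average reward)}\\
\text{s.t.} \quad
  & \sum_{a \in A(s)} x_{s,a} \;=\; \sum_{s' \in S} \sum_{a' \in A(s')} x_{s',a'}\, P(s \mid s', a')
  && \forall s \in S \quad \text{(flow balance / stationarity)}\\
  & \sum_{s \in S} \sum_{a \in A(s)} x_{s,a} \;=\; 1
  && \text{(normalisation: $x$ is a distribution)}\\
  & \ell_s \;\le\; \sum_{a \in A(s)} x_{s,a} \;\le\; u_s
  && \forall s \in S \quad \text{(steady-state constraints)}
\end{align*}
In the unified solution described above, a programme of this flavour is set up over the product of the MDP with the LDBA, with additional dimensions of the long-run average reward encoding the acceptance condition and the probability threshold.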