适应环境行为 (Adapting to the Behavior of Environments with Bounded Memory) - 专知论文

会员服务 ·

0

回合 · INFORMS · 控制器 · 学成 · 缩放 ·

2021 年 9 月 17 日

Adapting to the Behavior of Environments with Bounded Memory

翻译：适应环境行为

Dhananjay Raju,Rüdiger Ehlers,Ufuk Topcu

from arxiv, In Proceedings GandALF 2021, arXiv:2109.07798

We study the problem of synthesizing implementations from temporal logic specifications that need to work correctly in all environments that can be represented as transducers with a limited number of states. This problem was originally defined and studied by Kupferman, Lustig, Vardi, and Yannakakis. They provide NP and 2-EXPTIME lower and upper bounds (respectively) for the complexity of this problem, in the size of the transducer. We tighten the gap by providing a PSPACE lower bound, thereby showing that algorithms for solving this problem are unlikely to scale to large environment sizes. This result is somewhat unfortunate as solving this problem enables tackling some high-level control problems in which an agent has to infer the environment behavior from observations. To address this observation, we study a modified synthesis problem in which the synthesized controller must gather information about the environment's behavior safely. We show that the problem of determining whether the behavior of such an environment can be safely learned is only co-NP-complete. Furthermore, in such scenarios, the behavior of the environment can be learned using a Turing machine that requires at most polynomial space in the size of the environment's transducer.

翻译：我们研究从时间逻辑规范中综合执行的问题,这些逻辑规范需要在所有环境中正确工作,这些逻辑规范需要在所有环境中以数量有限的国家作为转换器。这个问题最初由Kupferman、 Lustig、 Vardi 和 Yannakakis 界定和研究。这个问题最初由 Kupferman、 Lustig、 Vardi 和 Yannakakis 来界定和研究。它们为这一问题的复杂性提供了NP 和 2- EXPTIME 下层和上界( 分别) 。我们通过提供低限的 PSPACE 来缩小差距, 从而显示解决这一问题的算法不可能扩大到大环境大小。这个结果有些不幸, 因为解决这个问题能够解决某些高层控制问题, 使一个代理从观测中推断环境行为。为了解决这一观察, 我们研究一个经过修改的综合问题, 综合控制器必须安全地收集环境行为的信息。我们表明, 确定这种环境的行为能否安全地学到的问题只是共同- NPEPEE 。此外, 在这种假设中, 环境的行为可以学习到在最多元空间大小的环境中需要的图式机器。

0

相关内容

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

专知会员服务

53+阅读 · 2020年8月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

59+阅读 · 2020年6月30日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Asynchronous Gathering Algorithms for Autonomous Mobile Robots with Lights

Arxiv

0+阅读 · 2021年11月9日

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Arxiv

0+阅读 · 2021年11月7日

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

Arxiv

3+阅读 · 2020年12月8日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Arxiv

3+阅读 · 2019年3月6日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Visual Reinforcement Learning with Imagined Goals

Arxiv

8+阅读 · 2018年7月12日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Arxiv

6+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

相关VIP内容

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

【KDD2020-Tutorial】自动推荐系统，Automated Recommendation System

专知会员服务

53+阅读 · 2020年8月25日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

59+阅读 · 2020年6月30日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS2025】VideoLucy：用于长视频理解的深度记忆回溯机制

不确定环境下无人机与无人地面车辆编队的地下勘探规划算法 | 122页

【NTU博士论文】端到端鲁棒自动语音识别的最新进展

用于强化学习的扩散模型：基础、分类与发展

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Asynchronous Gathering Algorithms for Autonomous Mobile Robots with Lights

Arxiv

0+阅读 · 2021年11月9日

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Arxiv

0+阅读 · 2021年11月7日

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

Arxiv

3+阅读 · 2020年12月8日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Arxiv

3+阅读 · 2019年3月6日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Visual Reinforcement Learning with Imagined Goals

Arxiv

8+阅读 · 2018年7月12日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Arxiv

6+阅读 · 2018年1月16日

微信扫码咨询专知VIP会员