Multi-objective optimization models that encode ordered, sequential constraints provide a way to model a range of challenging problems, including encoding preferences, modeling a curriculum, and enforcing safety measures. A recently developed theory of topological Markov decision processes (TMDPs) captures this range of problems for the case of discrete states and actions. In this work, we extend TMDPs to continuous spaces and unknown transition dynamics by formulating, proving, and implementing the policy gradient theorem for TMDPs. This theoretical result enables the creation of TMDP learning algorithms that use function approximators and that generalize existing deep reinforcement learning (DRL) approaches. Specifically, we present a new policy gradient algorithm for TMDPs obtained as a simple extension of the proximal policy optimization (PPO) algorithm. We demonstrate it on a real-world multi-objective navigation problem with an arbitrary ordering of objectives, both in simulation and on a real robot.
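To make the PPO-style extension concrete, the sketch below shows one hypothetical way per-objective advantages could be combined under an ordered weighting before entering a standard clipped surrogate objective. This is a minimal illustration under assumed names and an assumed priority-weighting scheme; it is not the paper's TMDP policy gradient algorithm.

```python
# Hypothetical sketch (not the paper's algorithm): combining per-objective
# advantages with ordered priority weights, then applying the standard PPO
# clipped surrogate. All names and the weighting scheme are assumptions.
import numpy as np

def ordered_advantage(advs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Combine per-objective advantages (shape [T, K]) with ordered weights
    (shape [K]), where earlier objectives receive larger weights."""
    return advs @ weights

def ppo_clip_objective(ratio: np.ndarray, adv: np.ndarray, eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate, averaged over the batch."""
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    return float(np.mean(np.minimum(unclipped, clipped)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, K = 64, 3                        # timesteps, number of ordered objectives
    advs = rng.normal(size=(T, K))      # per-objective advantage estimates
    ratio = np.exp(rng.normal(scale=0.1, size=T))   # pi_new / pi_old
    weights = np.array([1.0, 0.1, 0.01])            # assumed priorities, highest first
    adv = ordered_advantage(advs, weights)
    print("surrogate objective:", ppo_clip_objective(ratio, adv))
```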