Critic Sequential Monte Carlo - Zhuanzhi Papers


May 30, 2022

Critic Sequential Monte Carlo


Vasileios Lioutas, Jonathan Wilder Lavington, Justice Sefas, Matthew Niedoba, Yunpeng Liu, Berend Zwartsenberg, Setareh Dabiri, Frank Wood, Adam Scibior

From arXiv, 20 pages, 3 figures

We introduce CriticSMC, a new algorithm for planning as inference built from a novel composition of sequential Monte Carlo with learned soft-Q function heuristic factors. The algorithm is structured to allow large numbers of putative particles, leading to efficient use of computational resources and effective discovery of high-reward trajectories, even in environments with difficult reward surfaces such as those arising from hard constraints. Relative to prior art, our approach remains compatible with model-free reinforcement learning, in the sense that the implicit policy we produce can be used at test time in the absence of a world model. Our experiments on self-driving car collision avoidance in simulation demonstrate improvements over baselines in infraction minimization relative to computational effort, while maintaining the diversity and realism of the trajectories found.
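To make the idea of critic-scored putative particles concrete, here is a minimal toy sketch. It is not the authors' implementation: the 1D dynamics (`state + action`), the Gaussian prior policy, the hand-written `toy_soft_q` stand-in for a learned critic, and every function and parameter name are assumptions made for illustration. The sketch also simplifies the weighting to a plain critic score per putative action and omits the heuristic-factor bookkeeping a full sequential Monte Carlo treatment would need.

```python
import math
import random


def toy_soft_q(state, action, limit=1.0, penalty=20.0):
    """Hypothetical stand-in for a learned soft-Q critic.

    Returns a large negative score when the action would move the
    state outside the hard constraint |state| <= limit, else 0.
    """
    return -penalty if abs(state + action) > limit else 0.0


def critic_guided_smc(n_particles=64, n_putative=16, horizon=20, seed=0):
    """Toy particle planner: score many putative actions with the critic,
    resample down to n_particles, then advance the surviving particles."""
    rng = random.Random(seed)
    trajectories = [[0.0] for _ in range(n_particles)]
    for _t in range(horizon):
        candidates = []   # (parent particle index, proposed action)
        log_weights = []  # critic score of each putative action
        for i, traj in enumerate(trajectories):
            for _k in range(n_putative):
                # Assumed prior policy: zero-mean Gaussian actions.
                action = rng.gauss(0.0, 0.5)
                candidates.append((i, action))
                # Scoring uses only the critic; the dynamics are not stepped here.
                log_weights.append(toy_soft_q(traj[-1], action))
        # Normalize in log space for numerical stability.
        max_lw = max(log_weights)
        weights = [math.exp(lw - max_lw) for lw in log_weights]
        # Resample n_particles survivors from the n_particles * n_putative pool.
        survivors = rng.choices(candidates, weights=weights, k=n_particles)
        # Only the survivors are advanced through the (toy) dynamics.
        trajectories = [
            trajectories[i] + [trajectories[i][-1] + a] for i, a in survivors
        ]
    return trajectories


if __name__ == "__main__":
    trajs = critic_guided_smc()
    violations = sum(abs(s) > 1.0 for t in trajs for s in t)
    print(f"{len(trajs)} trajectories, {violations} constraint violations")
```

In this sketch the pool of candidates is `n_putative` times larger than the number of surviving particles, yet the dynamics are stepped only once per survivor. That is the sense in which scoring with a critic can make a large number of putative particles cheap.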


