Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors - Zhuanzhi (专知) Papers


Controller · Optimizer · Extensibility · Reinforcement Learning · 学成

January 4, 2022

Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors


Sven Gronauer, Matthias Kissel, Luca Sacchetto, Mathias Korte, Klaus Diepold

In this work, we show that it is possible to train low-level control policies with reinforcement learning entirely in simulation and, then, deploy them on a quadrotor robot without using real-world data to fine-tune. To render zero-shot policy transfers feasible, we apply simulation optimization to narrow the reality gap. Our neural network-based policies use only onboard sensor data and run entirely on the embedded drone hardware. In extensive real-world experiments, we compare three different control structures ranging from low-level pulse-width-modulated motor commands to high-level attitude control based on nested proportional-integral-derivative controllers. Our experiments show that low-level controllers trained with reinforcement learning require a more accurate simulation than higher-level control policies.
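The abstract does not say how the simulation optimization is carried out, so the following is only a minimal sketch of the general idea: adjust simulator parameters until simulated trajectories match recorded ones. Everything here is an assumption made for illustration, not the authors' method: the toy 1-D altitude model, the parameter names `thrust_gain` and `motor_tau`, the random-search optimizer, and the synthetic "reference" data that stands in for real flight logs.

```python
import numpy as np

G = 9.81                                 # gravity [m/s^2]
INITIAL_GUESS = np.array([15.0, 0.10])   # hypothetical (thrust_gain, motor_tau)
LOWER = np.array([1.0, 0.02])            # assumed parameter bounds
UPPER = np.array([40.0, 0.5])


def simulate(thrust_gain, motor_tau, commands, dt=0.01):
    """Roll out a toy 1-D altitude model and return vertical velocities."""
    thrust, vz = 0.0, 0.0
    velocities = np.empty(len(commands))
    for i, u in enumerate(commands):
        # First-order motor lag: thrust moves toward the commanded value.
        thrust += dt / motor_tau * (thrust_gain * u - thrust)
        # Thrust is treated as a mass-normalized acceleration.
        vz += dt * (thrust - G)
        velocities[i] = vz
    return velocities


def discrepancy(params, commands, reference):
    """Mean squared error between simulated and reference velocities."""
    simulated = simulate(params[0], params[1], commands)
    return float(np.mean((simulated - reference) ** 2))


def optimize_simulation(commands, reference, n_iters=300, seed=0):
    """Random search: keep a perturbed candidate only if it lowers the error."""
    rng = np.random.default_rng(seed)
    best = INITIAL_GUESS.copy()
    best_loss = discrepancy(best, commands, reference)
    step = np.array([2.0, 0.03])
    for _ in range(n_iters):
        candidate = np.clip(best + step * rng.normal(size=2), LOWER, UPPER)
        loss = discrepancy(candidate, commands, reference)
        if loss < best_loss:
            best, best_loss = candidate, loss
    return best, best_loss


# Synthetic stand-in for logged flight data (NOT real measurements).
t = np.arange(200) * 0.01
commands = 0.5 + 0.2 * np.sin(2.0 * np.pi * t)
reference = simulate(20.0, 0.05, commands)

initial_loss = discrepancy(INITIAL_GUESS, commands, reference)
best, best_loss = optimize_simulation(commands, reference)
print("initial loss:", initial_loss, "optimized loss:", best_loss)
```

In a real pipeline the reference trajectory would come from onboard sensor logs, and the tuned parameters would then be used in the simulator that trains the policy. The acceptance rule guarantees only that the error never increases; it does not guarantee that the true parameters are recovered.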

