Learning Control from Raw Position Measurements


January 30, 2023


Fabio Amadio, Alberto Dalla Libera, Daniel Nikovski, Ruggero Carli, Diego Romeres

From arXiv. Accepted at the 2023 American Control Conference (ACC).

We propose a Model-Based Reinforcement Learning (MBRL) algorithm named VF-MC-PILCO, specifically designed for application to mechanical systems where velocities cannot be directly measured. This circumstance, if not adequately considered, can compromise the success of MBRL approaches. To cope with this problem, we define a velocity-free state formulation which consists of the collection of past positions and inputs. Then, VF-MC-PILCO uses Gaussian Process Regression to model the dynamics of the velocity-free state and optimizes the control policy through a particle-based policy gradient approach. We compare VF-MC-PILCO with our previous MBRL algorithm, MC-PILCO4PMS, which handles the lack of direct velocity measurements by modeling the presence of velocity estimators. Results on both simulated (cart-pole and UR5 robot) and real mechanical systems (Furuta pendulum and a ball-and-plate rig) show that the two algorithms achieve similar results. Conveniently, VF-MC-PILCO does not require the design and implementation of state estimators, which can be a challenging and time-consuming activity to be performed by an expert user.
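The velocity-free state idea can be illustrated with a small sketch. It is not the authors' implementation: the history lengths `m_q` and `m_u`, the RBF kernel, its fixed hyperparameters, and the toy mass-spring-damper simulation are all assumptions chosen for illustration. The sketch stacks past positions and inputs into a regression input and fits a basic Gaussian Process to predict the next position change, without ever estimating a velocity.

```python
import numpy as np

def build_velocity_free_dataset(q, u, m_q=1, m_u=0):
    """Stack past positions and inputs into a velocity-free regression input.

    Hypothetical layout (an assumption, not taken from the paper):
      X[t] = [q_t, ..., q_{t-m_q}, u_t, ..., u_{t-m_u}],  Y[t] = q_{t+1} - q_t
    """
    q = np.asarray(q, dtype=float).reshape(len(q), -1)
    u = np.asarray(u, dtype=float).reshape(len(u), -1)
    start = max(m_q, m_u)
    X, Y = [], []
    for t in range(start, len(q) - 1):
        past_q = q[t - m_q:t + 1][::-1].ravel()  # newest position first
        past_u = u[t - m_u:t + 1][::-1].ravel()  # newest input first
        X.append(np.concatenate([past_q, past_u]))
        Y.append(q[t + 1] - q[t])
    return np.array(X), np.array(Y)

def rbf_kernel(A, B, lengthscale, variance):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)

class SimpleGP:
    """Zero-mean GP regression with fixed (not optimized) hyperparameters."""

    def __init__(self, lengthscale=1.0, variance=1.0, noise=1e-6):
        self.lengthscale, self.variance, self.noise = lengthscale, variance, noise

    def fit(self, X, Y):
        self.X = X
        K = rbf_kernel(X, X, self.lengthscale, self.variance)
        self.L = np.linalg.cholesky(K + self.noise * np.eye(len(X)))
        # alpha = (K + noise * I)^{-1} Y, computed via the Cholesky factor
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, Y))
        return self

    def predict(self, Xs):
        Ks = rbf_kernel(Xs, self.X, self.lengthscale, self.variance)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = self.variance - np.sum(v**2, axis=0)
        return mean, np.maximum(var, 0.0)

# Toy mass-spring-damper; only positions q are recorded, never velocities.
rng = np.random.default_rng(0)
T, dt = 200, 0.05
u = rng.uniform(-1.0, 1.0, size=(T, 1))
q = np.zeros((T, 1))
v = 0.0
for t in range(T - 1):
    v += dt * (u[t, 0] - 0.5 * v - q[t, 0])
    q[t + 1, 0] = q[t, 0] + dt * v

X, Y = build_velocity_free_dataset(q, u, m_q=1, m_u=0)
gp = SimpleGP().fit(X, Y)
mean, var = gp.predict(X)
rmse = float(np.sqrt(np.mean((mean - Y) ** 2)))
print(X.shape, Y.shape, rmse)
```

In this toy system two consecutive positions determine the velocity implicitly, so a history of `m_q=1` is enough for the model to predict the next position change. On real hardware the right history length depends on the noise and the sampling time, and the hyperparameters would normally be fitted to data rather than fixed.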


