Socially aware robot navigation, in which a robot is required to optimize its trajectory to maintain comfortable and compliant spatial interactions with humans in addition to reaching its goal without collisions, is a fundamental yet challenging task in the context of human-robot interaction. While existing learning-based methods have achieved better performance than the preceding model-based ones, they still have drawbacks: reinforcement learning depends on a handcrafted reward that is unlikely to effectively quantify broad social compliance and can lead to reward exploitation problems; meanwhile, inverse reinforcement learning suffers from the need for expensive human demonstrations. In this paper, we propose a feedback-efficient active preference learning approach, FAPL, that distills human comfort and expectation into a reward model to guide the robot agent in exploring latent aspects of social compliance. We further introduce hybrid experience learning to improve the efficiency of human feedback and samples, and evaluate the benefits of robot behaviors learned from FAPL through extensive simulation experiments and a user study (N=10) in which a physical robot navigates among human subjects in real-world scenarios. The source code and experiment videos for this work are available at: https://sites.google.com/view/san-fapl.