Role of reward shaping in object-goal navigation - 专知论文

July 16, 2022

Role of reward shaping in object-goal navigation

Srirangan Madhavan, Anwesan Pal, Henrik I. Christensen

From arXiv. Paper accepted at the Third Embodied AI Workshop at CVPR 2022.

Deep reinforcement learning approaches have recently become a popular method for visual navigation tasks in the computer vision and robotics communities. In most cases, the reward function has a binary structure: a large positive reward is provided when the agent reaches the goal state, and a negative step penalty is assigned for every other state in the environment. A sparse signal like this makes learning challenging, especially in large environments, where many sequential actions are needed to reach the target. We introduce a reward shaping mechanism that gradually adjusts the reward signal based on the distance to the goal. Detailed experiments conducted in the AI2-THOR simulation environment demonstrate the efficacy of the proposed approach for object-goal navigation tasks.
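The contrast between the binary reward and a distance-based reward can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact shaping function, so the sketch assumes a standard potential-based shaping term with potential phi(s) = -distance to goal. The constants `GOAL_REWARD` and `STEP_PENALTY`, the function names, the `success_radius` parameter, and the use of Euclidean distance are all hypothetical choices made for illustration.

```python
import math

# Hypothetical constants: the abstract only says "large positive reward"
# and "negative step penalty", without giving values.
GOAL_REWARD = 5.0
STEP_PENALTY = -0.01


def sparse_reward(reached_goal: bool) -> float:
    """Binary reward: goal bonus on success, step penalty otherwise."""
    return GOAL_REWARD if reached_goal else STEP_PENALTY


def shaped_reward(prev_dist: float, curr_dist: float, reached_goal: bool,
                  gamma: float = 0.99) -> float:
    """Sparse reward plus an assumed potential-based shaping bonus.

    With potential phi(s) = -distance_to_goal, the bonus
    gamma * phi(s') - phi(s) is positive when the agent moves closer
    to the goal and negative when it moves away.
    """
    bonus = gamma * (-curr_dist) - (-prev_dist)
    return sparse_reward(reached_goal) + bonus


def trajectory_rewards(positions, goal, success_radius=0.5):
    """Shaped reward for each step of a path given as a list of (x, y) points."""
    rewards = []
    for prev, curr in zip(positions, positions[1:]):
        d_prev = math.dist(prev, goal)
        d_curr = math.dist(curr, goal)
        rewards.append(shaped_reward(d_prev, d_curr, d_curr <= success_radius))
    return rewards


if __name__ == "__main__":
    path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(trajectory_rewards(path, goal=(2.0, 0.0)))
```

Under the sparse reward, every step before the last one returns the same penalty, so the agent gets no hint about direction. With the shaping term, each step that reduces the distance to the goal is rewarded immediately, which is the kind of denser signal the abstract argues for.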
