A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations

November 29, 2022

Sohan Rudra,Saksham Goel,Anirban Santara,Claudio Gentile,Laurent Perron,Fei Xia,Vikas Sindhwani,Carolina Parada,Gaurav Aggarwal

From arXiv. A shorter version was accepted at the NeurIPS 2022 Workshop on Robot Learning: Trustworthy Robotics.

Object-goal navigation (Object-nav) entails searching for, recognizing, and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are restricted to static objects (e.g., televisions and fridges). We propose a modular framework for Object-nav that efficiently searches indoor environments not only for static objects but also for movable objects (e.g., fruits, glasses, and phones) that frequently change position due to human intervention. Our contextual-bandit agent explores the environment efficiently by showing optimism in the face of uncertainty, and learns a model of the likelihood of spotting different objects from each navigable location. The likelihoods are used as rewards in a weighted minimum latency solver to deduce a trajectory for the robot. We evaluate our algorithms in two simulated environments and a real-world setting to demonstrate high sample efficiency and reliability.
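The two ingredients named in the abstract, an optimistic estimate of the likelihood of spotting an object from each location and a weighted minimum latency route over those locations, can be illustrated with a minimal sketch. This is not the authors' implementation: the UCB1-style bonus, the brute-force search over visiting orders, and the names `UcbSpotModel`, `weighted_latency`, and `plan_route` are all assumptions made for illustration, and the objective (sum of weight times arrival time) is one common reading of "weighted minimum latency".

```python
import itertools
import math


class UcbSpotModel:
    """Hypothetical UCB1-style estimate of P(object is spotted from location)."""

    def __init__(self, objects, locations):
        self.locations = list(locations)
        self.counts = {(o, l): 0 for o in objects for l in self.locations}
        self.hits = {(o, l): 0 for o in objects for l in self.locations}
        self.total = {o: 0 for o in objects}

    def update(self, obj, loc, spotted):
        """Record one visit to `loc` while searching for `obj`."""
        self.counts[(obj, loc)] += 1
        self.hits[(obj, loc)] += int(spotted)
        self.total[obj] += 1

    def score(self, obj, loc):
        """Empirical spotting rate plus an exploration bonus, capped at 1."""
        n = self.counts[(obj, loc)]
        if n == 0:
            return 1.0  # optimism: unvisited locations get the maximum score
        mean = self.hits[(obj, loc)] / n
        bonus = math.sqrt(2.0 * math.log(self.total[obj]) / n)
        return min(1.0, mean + bonus)


def weighted_latency(order, start, dist, weight):
    """Sum over visited locations of weight[loc] * arrival time at loc."""
    elapsed, cost, here = 0.0, 0.0, start
    for loc in order:
        elapsed += dist[(here, loc)]
        cost += weight[loc] * elapsed
        here = loc
    return cost


def plan_route(start, locations, dist, weight):
    """Brute-force minimiser of weighted latency (assumed stand-in for a
    real solver; only practical for a handful of locations)."""
    return min(
        itertools.permutations(locations),
        key=lambda order: weighted_latency(order, start, dist, weight),
    )


if __name__ == "__main__":
    model = UcbSpotModel(["phone"], ["desk", "sofa"])
    model.update("phone", "desk", spotted=False)
    model.update("phone", "sofa", spotted=True)
    # Toy 1-D layout; positions and distances are made up for the example.
    pos = {"start": 0.0, "desk": 1.0, "sofa": -2.0}
    dist = {(p, q): abs(pos[p] - pos[q]) for p in pos for q in pos}
    weight = {loc: model.score("phone", loc) for loc in model.locations}
    print(plan_route("start", model.locations, dist, weight))
```

In this sketch the optimistic scores play the role of the rewards passed to the planner, so rarely visited locations are pulled earlier into the route until enough observations shrink their bonus.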
