VSGM: Enhance robot task understanding ability through visual semantic graph


May 19, 2021


Cheng Yu Tsai,Mu-Chun Su

From arXiv, 16 pages, 7 figures

In recent years, developing AI for robotics has attracted much attention. The interaction between vision and language is particularly difficult for robots. We believe that giving robots an understanding of both visual semantics and language semantics will improve their inference ability. In this paper, we propose a novel method, VSGM (Visual Semantic Graph Memory), which uses a semantic graph to obtain better visual image features and improve the robot's visual understanding ability. Given prior knowledge supplied to the robot and the objects detected in the image, it predicts the correlations between objects and object attributes and converts them into a graph-based representation; it also maps the objects in the image onto a top-down egocentric map. Finally, the object features that matter for the current task are extracted by Graph Neural Networks. The proposed method is evaluated on the ALFRED (Action Learning From Realistic Environments and Directives) dataset, in which the robot must perform everyday indoor household tasks by following language instructions. Adding VSGM to the model improves the task success rate by 6 to 10%.
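The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of a graph over detected objects followed by one graph-neural-network-style propagation step. It is not the authors' VSGM code. The relation list `PRIOR_RELATIONS`, the helper names `build_adjacency` and `propagate`, the object names, and the two-dimensional feature vectors are all assumptions made for illustration, and the propagation step uses plain mean aggregation with no learned weights.

```python
from typing import List, Tuple

# Hypothetical prior knowledge: pairs of object classes assumed to be related.
PRIOR_RELATIONS: List[Tuple[str, str]] = [
    ("knife", "countertop"),
    ("apple", "fridge"),
    ("apple", "knife"),
]


def build_adjacency(detected: List[str]) -> List[List[float]]:
    """Build an undirected adjacency matrix (with self-loops) over detected objects."""
    index = {name: i for i, name in enumerate(detected)}
    n = len(detected)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in PRIOR_RELATIONS:
        # Only connect objects that were both detected in the current image.
        if a in index and b in index:
            adj[index[a]][index[b]] = 1.0
            adj[index[b]][index[a]] = 1.0
    return adj


def propagate(adj: List[List[float]], feats: List[List[float]]) -> List[List[float]]:
    """One message-passing step: each node takes the mean of its neighbours' features."""
    dim = len(feats[0])
    updated = []
    for row in adj:
        degree = sum(row)  # at least 1.0 because of the self-loop
        updated.append([
            sum(w * feats[j][d] for j, w in enumerate(row)) / degree
            for d in range(dim)
        ])
    return updated


# Toy input: three detected objects with made-up feature vectors.
detected = ["apple", "knife", "sofa"]
features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]

adj = build_adjacency(detected)
updated = propagate(adj, features)
print(updated)
```

In this toy input, "apple" and "knife" are linked by the assumed prior, so their features are averaged together, while "sofa" has no related neighbour and keeps its own features. A real system would replace the mean aggregation with learned graph neural network layers and take features from an object detector.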


