manipulathor:视觉物体操纵框架 (ManipulaTHOR: A Framework for Visual Object Manipulation) - 专知论文

会员服务 ·

0

回合 · INTERACT · 泛化理论 · 块 · 学成 ·

2021 年 4 月 22 日

ManipulaTHOR: A Framework for Visual Object Manipulation

翻译：manipulathor:视觉物体操纵框架

Kiana Ehsani,Winson Han,Alvaro Herrasti,Eli VanderBilt,Luca Weihs,Eric Kolve,Aniruddha Kembhavi,Roozbeh Mottaghi

from arxiv, CVPR 2021 -- (Oral presentation)

The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments. These early successes have laid the building blocks for the community to tackle tasks that require agents to actively interact with objects in their environment. Object manipulation is an established research domain within the robotics community and poses several challenges including manipulator motion, grasping and long-horizon planning, particularly when dealing with oft-overlooked practical setups involving visually rich and complex scenes, manipulation using mobile agents (as opposed to tabletop manipulation), and generalization to unseen environments and objects. We propose a framework for object manipulation built upon the physics-enabled, visually rich AI2-THOR framework and present a new challenge to the Embodied AI community known as ArmPointNav. This task extends the popular point navigation task to object manipulation and offers new challenges including 3D obstacle avoidance, manipulating objects in the presence of occlusion, and multi-object manipulation that necessitates long term planning. Popular learning paradigms that are successful on PointNav challenges show promise, but leave a large room for improvement.

翻译：物体操纵是机器人界内部一个固定的研究领域,它提出了几项挑战,包括操纵运动、掌握和长视横向规划,特别是在处理视觉丰富和复杂场景、使用移动剂操作(而不是桌面操纵)和对看不见的环境和物体进行一般化处理时。我们提议了一个基于物理学、视觉丰富的AI2-THOR框架的物体操纵框架,并对称为ArmPointNav的Embodi AI社区提出了新的挑战。这项任务将流行点导航任务扩大到物体操纵,并提出了新的挑战,包括3D避免障碍、在隐蔽和复杂场景面前操纵物体以及需要长期规划的多点操纵。在PointNav上成功的大众学习范例显示了希望,但留下了很大的改进空间。

0

相关内容

【干货书】机器人元素Elements of Robotics ，311页pdf

【干货书】机器人元素Elements of Robotics ，311页pdf

专知会员服务

38+阅读 · 2021年4月16日

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

专知会员服务

14+阅读 · 2020年6月18日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

专知会员服务

52+阅读 · 2020年5月26日

运动物体检测与运动相机:一个全面的综述：Moving Objects Detection with a Moving Camera: A Comprehensive Review

运动物体检测与运动相机:一个全面的综述：Moving Objects Detection with a Moving Camera: A Comprehensive Review

专知会员服务

27+阅读 · 2020年1月17日

【目标检测 | 2019最新综述】目标检测中的不平衡问题，附31页PDF， Imbalance Problems in Object Detection: A Review

【目标检测 | 2019最新综述】目标检测中的不平衡问题，附31页PDF， Imbalance Problems in Object Detection: A Review

专知会员服务

46+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

Keyframe-Focused Visual Imitation Learning

Arxiv

0+阅读 · 2021年6月11日

A modular framework for object-based saccadic decisions in dynamic scenes

Arxiv

0+阅读 · 2021年6月10日

ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation

Arxiv

0+阅读 · 2021年6月5日

GOO: A Dataset for Gaze Object Prediction in Retail Environments

Arxiv

0+阅读 · 2021年5月22日

Improving CNN-based Planar Object Detection with Geometric Prior Knowledge

Improving CNN-based Planar Object Detection with Geometric Prior Knowledge

Arxiv

6+阅读 · 2019年9月23日

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Arxiv

6+阅读 · 2019年5月16日

Stereo R-CNN based 3D Object Detection for Autonomous Driving

Stereo R-CNN based 3D Object Detection for Autonomous Driving

Arxiv

5+阅读 · 2019年2月26日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

DOTA: A Large-scale Dataset for Object Detection in Aerial Images

Arxiv

19+阅读 · 2018年1月27日

Long-Term Visual Object Tracking Benchmark

Arxiv

7+阅读 · 2017年12月28日

VIP会员

文章信息

相关主题

相关VIP内容

【干货书】机器人元素Elements of Robotics ，311页pdf

【干货书】机器人元素Elements of Robotics ，311页pdf

专知会员服务

38+阅读 · 2021年4月16日

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

专知会员服务

14+阅读 · 2020年6月18日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

专知会员服务

52+阅读 · 2020年5月26日

运动物体检测与运动相机:一个全面的综述：Moving Objects Detection with a Moving Camera: A Comprehensive Review

运动物体检测与运动相机:一个全面的综述：Moving Objects Detection with a Moving Camera: A Comprehensive Review

专知会员服务

27+阅读 · 2020年1月17日

【目标检测 | 2019最新综述】目标检测中的不平衡问题，附31页PDF， Imbalance Problems in Object Detection: A Review

【目标检测 | 2019最新综述】目标检测中的不平衡问题，附31页PDF， Imbalance Problems in Object Detection: A Review

专知会员服务

46+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《关于俄乌战争的系列文章》2025最新70页

《军事行动中的人机AI编队本体模型》

更智能的人工智能实现更快速的电磁辐射控制（EMCON）

《俄罗斯常规军队能力现状及重建》2025最新124页

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】深度学习目标检测概览

【推荐】深度学习目标检测概览

机器学习研究会

10+阅读 · 2017年9月1日

相关论文

Keyframe-Focused Visual Imitation Learning

Arxiv

0+阅读 · 2021年6月11日

A modular framework for object-based saccadic decisions in dynamic scenes

Arxiv

0+阅读 · 2021年6月10日

ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation

Arxiv

0+阅读 · 2021年6月5日

GOO: A Dataset for Gaze Object Prediction in Retail Environments

Arxiv

0+阅读 · 2021年5月22日

Improving CNN-based Planar Object Detection with Geometric Prior Knowledge

Improving CNN-based Planar Object Detection with Geometric Prior Knowledge

Arxiv

6+阅读 · 2019年9月23日

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Arxiv

6+阅读 · 2019年5月16日

Stereo R-CNN based 3D Object Detection for Autonomous Driving

Stereo R-CNN based 3D Object Detection for Autonomous Driving

Arxiv

5+阅读 · 2019年2月26日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

DOTA: A Large-scale Dataset for Object Detection in Aerial Images

Arxiv

19+阅读 · 2018年1月27日

Long-Term Visual Object Tracking Benchmark

Arxiv

7+阅读 · 2017年12月28日

微信扫码咨询专知VIP会员