通过变形器变形物体变形物体一般可学习的视觉-触觉机器人测深战略 (Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer) - 专知论文

会员服务 ·

0

Transformer模型 · 变换 · MoDELS · 深度前馈网络 · 学成 ·

2021 年 12 月 13 日

Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer

翻译：通过变形器变形物体变形物体一般可学习的视觉-触觉机器人测深战略

Yunhai Han,Rahul Batra,Nathan Boyd,Tuo Zhao,Yu She,Seth Hutchinson,Ye Zhao

from arxiv, This paper is submitted to RA-L

Reliable robotic grasping, especially with deformable objects, (e.g. fruit), remains a challenging task due to underactuated contact interactions with a gripper, unknown object dynamics, and variable object geometries. In this study, we propose a Transformer-based robotic grasping framework for rigid grippers that leverage tactile and visual information for safe object grasping. Specifically, the Transformer models learn physical feature embeddings with sensor feedback through performing two pre-defined explorative actions (pinching and sliding) and predict a final grasping outcome through a multilayer perceptron (MLP) with a given grasping strength. Using these predictions, the gripper is commanded with a safe grasping strength for the grasping tasks via inference. Compared with convolutional recurrent networks, the Transformer models can capture the long-term dependencies across the image sequences and process the spatial-temporal features simultaneously. We first benchmark the proposed Transformer models on a public dataset for slip detection. Following that, we show that the Transformer models outperform a CNN+LSTM model in terms of grasping accuracy and computational efficiency. We also collect our own fruit grasping dataset and conduct the online grasping experiments using the proposed framework for both seen and unseen fruits. Our codes and dataset are made public on GitHub.

翻译：可靠的机器人捕捉,特别是变形物体(例如水果)的捕捉,仍然是一项艰巨的任务,因为与抓取器、不明物体动态和可变天体的几何的接触互动不足。在本研究中,我们提议为僵硬的抓抓器建立一个以变压器为基础的机器人捕捉框架,以利用触动和视觉信息来安全捕捉物体。具体地说,变压器模型通过执行两个预先定义的探索动作(悬浮和滑动),学习与传感器反馈相嵌的物理特征,并预测通过一个具有一定捕捉力的多层透视器(MLP)获得最后的捕捉捉摸结果。使用这些预测,控制器被以安全掌握掌握能力来通过猜测掌握任务。与变压器经常网络相比,变压器模型可以同时捕捉图像序列之间的长期依赖性关系,并处理空间时空特征。我们首先将拟议的变压器模型基准放在一个公共数据集上,以便侦测。随后,我们显示变压器模型在掌握我们公共的精确度和掌握结果的模型上,同时收集我们所看到的数据。

0

相关内容

Transformer模型

Transformer模型

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

【ICCV 2021】OadTR框架：基于Transformers的在线行为检测任务

专知会员服务

10+阅读 · 2021年9月11日

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

专知会员服务

22+阅读 · 2021年4月20日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

33+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

视觉机械臂 visual-pushing-grasping

视觉机械臂 visual-pushing-grasping

CreateAMind

3+阅读 · 2018年5月25日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇深度学习相关论文推介——感知度量、图像检索、联合视盘和视杯分割、谱聚类、MPI并行

【论文推荐】最新5篇深度学习相关论文推介——感知度量、图像检索、联合视盘和视杯分割、谱聚类、MPI并行

专知

6+阅读 · 2018年1月15日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【推荐】免费书(草稿)：数据科学的数学基础

【推荐】免费书(草稿)：数据科学的数学基础

机器学习研究会

20+阅读 · 2017年10月1日

O-ViT: Orthogonal Vision Transformer

O-ViT: Orthogonal Vision Transformer

Arxiv

0+阅读 · 2022年2月16日

DelTact: A Vision-based Tactile Sensor Using Dense Color Pattern

Arxiv

0+阅读 · 2022年2月15日

Tac3D: A Novel Vision-based Tactile Sensor for Measuring Forces Distribution and Estimating Friction Coefficient Distribution

Arxiv

0+阅读 · 2022年2月13日

Grasp Control of a Cable-Driven Robotic Hand Using a PVDF Slip Detection Sensor

Arxiv

0+阅读 · 2022年2月12日

End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB

Arxiv

0+阅读 · 2022年2月11日

TACTO: A Fast, Flexible, and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

Arxiv

0+阅读 · 2022年2月10日

Generative Video Transformer: Can Objects be the Words?

Arxiv

6+阅读 · 2021年7月20日

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Arxiv

7+阅读 · 2021年3月22日

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Arxiv

6+阅读 · 2019年5月16日

Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning

Arxiv

6+阅读 · 2019年2月27日

VIP会员

文章信息

相关主题

Transformer模型

深度前馈网络

相关VIP内容

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

【ICCV 2021】OadTR框架：基于Transformers的在线行为检测任务

专知会员服务

10+阅读 · 2021年9月11日

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

【CVPR 2021】变换器跟踪TransT: Transformer Tracking

专知会员服务

22+阅读 · 2021年4月20日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

33+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《生成式人工智能与大/小语言模型在供应链管理决策优化与可持续性提升中的作用评估》最新51页

白宫发布《赢得AI竞赛：美国人工智能行动计划》最新28页

地下战：地下空间的战略博弈

《美地下作战条令手册》228页

相关资讯

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

视觉机械臂 visual-pushing-grasping

视觉机械臂 visual-pushing-grasping

CreateAMind

3+阅读 · 2018年5月25日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇深度学习相关论文推介——感知度量、图像检索、联合视盘和视杯分割、谱聚类、MPI并行

【论文推荐】最新5篇深度学习相关论文推介——感知度量、图像检索、联合视盘和视杯分割、谱聚类、MPI并行

专知

6+阅读 · 2018年1月15日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【推荐】免费书(草稿)：数据科学的数学基础

【推荐】免费书(草稿)：数据科学的数学基础

机器学习研究会

20+阅读 · 2017年10月1日

相关论文

O-ViT: Orthogonal Vision Transformer

O-ViT: Orthogonal Vision Transformer

Arxiv

0+阅读 · 2022年2月16日

DelTact: A Vision-based Tactile Sensor Using Dense Color Pattern

Arxiv

0+阅读 · 2022年2月15日

Tac3D: A Novel Vision-based Tactile Sensor for Measuring Forces Distribution and Estimating Friction Coefficient Distribution

Arxiv

0+阅读 · 2022年2月13日

Grasp Control of a Cable-Driven Robotic Hand Using a PVDF Slip Detection Sensor

Arxiv

0+阅读 · 2022年2月12日

End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB

Arxiv

0+阅读 · 2022年2月11日

TACTO: A Fast, Flexible, and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

Arxiv

0+阅读 · 2022年2月10日

Generative Video Transformer: Can Objects be the Words?

Arxiv

6+阅读 · 2021年7月20日

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Arxiv

7+阅读 · 2021年3月22日

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review

Arxiv

6+阅读 · 2019年5月16日

Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning

Arxiv

6+阅读 · 2019年2月27日

微信扫码咨询专知VIP会员