With the emergence of collaborative robots (cobots), human-robot collaboration in industrial manufacturing is coming into focus. For a cobot to act autonomously and as an assistant, it must understand human actions during assembly. To effectively train models for this task, a dataset containing suitable assembly actions in a realistic setting is crucial. For this purpose, we present the ATTACH dataset, which contains 51.6 hours of assembly with 95.2k annotated fine-grained actions monitored by three cameras, which represent potential viewpoints of a cobot. Since workers in an assembly context tend to perform different actions with their two hands simultaneously, we annotated the performed actions for each hand separately. As a result, more than 68% of the annotations in the ATTACH dataset overlap with other annotations, which is many times more than in related datasets, which typically feature simpler assembly tasks. For better generalization with respect to the background of the working area, we not only recorded color and depth images but also used the Azure Kinect Body Tracking SDK to estimate 3D skeletons of the worker. To create a first baseline, we report the performance of state-of-the-art methods for action recognition as well as action detection on video and skeleton-sequence inputs. The dataset is available at https://www.tu-ilmenau.de/neurob/data-sets-code/attach-dataset .
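As background for the skeleton modality mentioned above, the following is a minimal sketch of how 3D skeletons can be obtained with the Azure Kinect Body Tracking SDK (C API). It is illustrative only and not part of the dataset release; the camera configuration values are assumptions and need not match those used during recording, and error handling is reduced to asserts for brevity.

```c
// Sketch: grab one capture from an Azure Kinect and extract 3D skeleton joints
// with the Body Tracking SDK. Assumed configuration values are illustrative.
#include <assert.h>
#include <stdio.h>
#include <k4a/k4a.h>
#include <k4abt.h>

int main(void)
{
    // Open the device and start depth + color streams.
    k4a_device_t device = NULL;
    assert(k4a_device_open(K4A_DEVICE_DEFAULT, &device) == K4A_RESULT_SUCCEEDED);

    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED;      // assumed depth mode
    config.color_resolution = K4A_COLOR_RESOLUTION_1080P;  // assumed color resolution
    assert(k4a_device_start_cameras(device, &config) == K4A_RESULT_SUCCEEDED);

    // The body tracker requires the calibration matching the stream configuration.
    k4a_calibration_t calibration;
    assert(k4a_device_get_calibration(device, config.depth_mode, config.color_resolution,
                                      &calibration) == K4A_RESULT_SUCCEEDED);

    k4abt_tracker_t tracker = NULL;
    k4abt_tracker_configuration_t tracker_config = K4ABT_TRACKER_CONFIG_DEFAULT;
    assert(k4abt_tracker_create(&calibration, tracker_config, &tracker) == K4A_RESULT_SUCCEEDED);

    // Feed a single capture to the tracker and pop the resulting body frame.
    k4a_capture_t capture = NULL;
    assert(k4a_device_get_capture(device, &capture, K4A_WAIT_INFINITE) == K4A_WAIT_RESULT_SUCCEEDED);
    assert(k4abt_tracker_enqueue_capture(tracker, capture, K4A_WAIT_INFINITE) == K4A_WAIT_RESULT_SUCCEEDED);

    k4abt_frame_t body_frame = NULL;
    assert(k4abt_tracker_pop_result(tracker, &body_frame, K4A_WAIT_INFINITE) == K4A_WAIT_RESULT_SUCCEEDED);

    // Each detected body yields a skeleton of 32 joints with 3D positions in
    // millimeters (depth camera coordinates) and per-joint confidence levels.
    uint32_t num_bodies = k4abt_frame_get_num_bodies(body_frame);
    for (uint32_t i = 0; i < num_bodies; i++)
    {
        k4abt_skeleton_t skeleton;
        assert(k4abt_frame_get_body_skeleton(body_frame, i, &skeleton) == K4A_RESULT_SUCCEEDED);
        k4a_float3_t wrist = skeleton.joints[K4ABT_JOINT_WRIST_LEFT].position;
        printf("body %u: left wrist at (%.1f, %.1f, %.1f) mm\n",
               i, wrist.v[0], wrist.v[1], wrist.v[2]);
    }

    // Release resources in reverse order of creation.
    k4abt_frame_release(body_frame);
    k4a_capture_release(capture);
    k4abt_tracker_shutdown(tracker);
    k4abt_tracker_destroy(tracker);
    k4a_device_stop_cameras(device);
    k4a_device_close(device);
    return 0;
}
```

In a real recording pipeline the capture/enqueue/pop steps would run in a loop over the whole session, and the per-joint positions would be serialized per frame to form the skeleton sequences used as model input.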