OO-dMVMT: A Deep Multi-view Multi-task Classification Framework for Real-time 3D Hand Gesture Classification and Segmentation


Multi-view · Segmentation · Low latency · Augmented reality · Virtual reality

April 12, 2023


Federico Cunico,Federico Girella,Andrea Avogaro,Marco Emporio,Andrea Giachetti,Marco Cristani

From arXiv. Accepted to the Computer Vision for Mixed Reality workshop at CVPR 2023.

Continuous mid-air hand gesture recognition based on captured hand pose streams is fundamental for human-computer interaction, particularly in AR / VR. However, many of the methods proposed to recognize heterogeneous hand gestures are tested only on the classification task, and the real-time low-latency gesture segmentation in a continuous stream is not well addressed in the literature. For this task, we propose the On-Off deep Multi-View Multi-Task paradigm (OO-dMVMT). The idea is to exploit multiple time-local views related to hand pose and movement to generate rich gesture descriptions, along with using heterogeneous tasks to achieve high accuracy. OO-dMVMT extends the classical MVMT paradigm, where all of the multiple tasks have to be active at each time, by allowing specific tasks to switch on/off depending on whether they can apply to the input. We show that OO-dMVMT defines the new SotA on continuous/online 3D skeleton-based gesture recognition in terms of gesture classification accuracy, segmentation accuracy, false positives, and decision latency while maintaining real-time operation.
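The on/off idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the task names (`gesture_class`, `start_offset`), the loss functions, and the convention that a `None` target switches a task off are all assumptions made for illustration. It only shows how a multi-task loss might skip tasks that do not apply to the current input window.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class given predicted probabilities."""
    return -math.log(probs[label])

def squared_error(pred, target):
    """Squared error for a scalar regression target."""
    return (pred - target) ** 2

def on_off_loss(outputs, targets, tasks):
    """Sum the losses of the tasks that apply to this input window.

    Assumption: a task is switched off when its target is None, so it
    adds nothing to the total (and, in a real network, no gradient).
    """
    total = 0.0
    active = []
    for name, loss_fn in tasks.items():
        target = targets.get(name)
        if target is None:  # task is "off" for this input
            continue
        total += loss_fn(outputs[name], target)
        active.append(name)
    return total, active

# Hypothetical tasks: a gesture class (always on) and a gesture-start offset
# (only meaningful when the window actually contains a gesture).
TASKS = {"gesture_class": cross_entropy, "start_offset": squared_error}

outputs = {"gesture_class": [0.1, 0.7, 0.2], "start_offset": 0.4}

# Window containing a gesture: both tasks are on.
loss_on, active_on = on_off_loss(
    outputs, {"gesture_class": 1, "start_offset": 0.5}, TASKS
)

# Window with no gesture (class 0 stands for "no gesture"): offset task is off.
loss_off, active_off = on_off_loss(
    outputs, {"gesture_class": 0, "start_offset": None}, TASKS
)

print(active_on, active_off)
```

In a classical MVMT setup every task would contribute to the loss for every input. Here the regression task is simply skipped when it has no meaningful target, which is the behavior the abstract describes as switching tasks on and off.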


