Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition - 专知 (Zhuanzhi) Papers


April 11, 2023

Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition


Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. Such data is difficult and time-consuming to generate and requires domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step-annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate the proposed method and show its effectiveness on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and on the public benchmark CATARACTS, which contains 50 cataract surgeries.
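The abstract does not give the form of the step-phase dependency loss, so the following is only an illustrative sketch of how a coarse phase label could supervise step predictions. It assumes a hypothetical step-to-phase mapping (`STEP_TO_PHASE`) and penalizes probability mass placed on steps that cannot occur in the annotated phase. The function names and the exact formulation are assumptions, not the authors' definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_phase_dependency_loss(step_logits, phase_label, step_to_phase):
    """Negative log of the probability mass on steps allowed in the phase.

    Assumed formulation for illustration only: the loss is near zero when
    the predicted step distribution is consistent with the phase label and
    grows as mass moves to steps that belong to other phases.
    """
    probs = softmax(step_logits)
    allowed_mass = sum(
        p for step, p in enumerate(probs) if step_to_phase[step] == phase_label
    )
    return -math.log(max(allowed_mass, 1e-12))


# Hypothetical hierarchy: steps 0-1 belong to phase 0, steps 2-3 to phase 1.
STEP_TO_PHASE = [0, 0, 1, 1]

frame_logits = [4.0, 1.0, -2.0, -2.0]  # the model favours step 0
consistent = step_phase_dependency_loss(frame_logits, 0, STEP_TO_PHASE)
inconsistent = step_phase_dependency_loss(frame_logits, 1, STEP_TO_PHASE)
```

In this sketch a frame needs only a phase label, not a step label, to contribute a training signal. That is the sense in which phases act as weak supervision for videos without step annotations.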

