Batch Normalization's (BN) unique property of depending on other samples in a batch is known to cause problems in several tasks, including sequential modeling. Yet, BN-related issues are hardly studied for long video understanding, despite the ubiquitous use of BN in CNNs for feature extraction. Especially in surgical workflow analysis, where the lack of pretrained feature extractors has led to complex, multi-stage training pipelines, limited awareness of BN issues may have hidden the benefits of training CNNs and temporal models end to end. In this paper, we present and analyze known as well as novel pitfalls of BN in video learning, including issues specific to online tasks such as a 'cheating' effect in anticipation. We observe that BN's properties create major obstacles for end-to-end learning. However, with BN-free backbones, even simple CNN-LSTMs beat the state of the art in two surgical tasks when trained with adequate end-to-end strategies that maximize temporal context. We conclude that awareness of BN's pitfalls is crucial for effective end-to-end learning in surgical tasks. By reproducing results on natural-video datasets, we hope our insights will benefit other areas of video learning as well. Code: \url{https://gitlab.com/nct_tso_public/pitfalls_bn}.