Scene segmentation and classification (SSC) serve as a critical step in video structure analysis. Intuitively, learning these two tasks jointly should be mutually beneficial, since they share common information. However, scene segmentation focuses more on the local differences between adjacent shots, while classification requires a global representation of scene segments, which can cause the model to be dominated by one of the two tasks during training. In this paper, we take an alternative perspective to overcome this challenge: we unify the two tasks into one by predicting shot links, where a link connects two adjacent shots and indicates that they belong to the same scene or category. To this end, we propose a general One Stage Multimodal Sequential Link Framework (OS-MSL) that both distinguishes and leverages the two-fold semantics by reformulating the two learning tasks into a unified one. Furthermore, we tailor a specific module called DiffCorrNet to explicitly extract the differences and correlations among shots. Extensive experiments are conducted on a brand-new large-scale dataset collected from real-world applications, as well as on MovieScenes. The results on both demonstrate the effectiveness of our proposed method against strong baselines.
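To make the shot-link formulation concrete, the following is a minimal sketch (not the paper's actual implementation; function and variable names are illustrative) of how binary links between adjacent shots can be decoded back into scene segments, which is what unifies segmentation with per-segment classification:

```python
# Hypothetical illustration of the "shot link" idea: links[i] == 1 means
# shot i and shot i+1 belong to the same scene, so a run of 1-links forms
# one scene segment and each 0-link marks a scene boundary.

def links_to_scenes(links):
    """Group shots 0..len(links) into scenes from adjacent-shot links.

    Returns a list of scenes, each a list of consecutive shot indices.
    """
    scenes = [[0]]
    for i, linked in enumerate(links):
        if linked:
            scenes[-1].append(i + 1)   # same scene: extend current segment
        else:
            scenes.append([i + 1])     # boundary: start a new segment
    return scenes

# 6 shots with links 1,1,0,1,0 -> three scenes: {0,1,2}, {3,4}, {5}
print(links_to_scenes([1, 1, 0, 1, 0]))  # [[0, 1, 2], [3, 4], [5]]
```

Under this decoding, a per-link binary prediction simultaneously determines both the segmentation (where scenes end) and the grouping over which a category label would be assigned.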