DVIS: Decoupled Video Instance Segmentation Framework - 专知论文

会员服务 ·

0

HTTPS · 示例 · SOTA · 相互独立的 · INFORMS ·

2023 年 6 月 8 日

DVIS: Decoupled Video Instance Segmentation Framework

翻译：暂无翻译

Tao Zhang,Xingye Tian,Yu Wu,Shunping Ji,Xuebo Wang,Yuan Zhang,Pengfei Wan

Video instance segmentation (VIS) is a critical task with diverse applications, including autonomous driving and video editing. Existing methods often underperform on complex and long videos in real world, primarily due to two factors. Firstly, offline methods are limited by the tightly-coupled modeling paradigm, which treats all frames equally and disregards the interdependencies between adjacent frames. Consequently, this leads to the introduction of excessive noise during long-term temporal alignment. Secondly, online methods suffer from inadequate utilization of temporal information. To tackle these challenges, we propose a decoupling strategy for VIS by dividing it into three independent sub-tasks: segmentation, tracking, and refinement. The efficacy of the decoupling strategy relies on two crucial elements: 1) attaining precise long-term alignment outcomes via frame-by-frame association during tracking, and 2) the effective utilization of temporal information predicated on the aforementioned accurate alignment outcomes during refinement. We introduce a novel referring tracker and temporal refiner to construct the \textbf{D}ecoupled \textbf{VIS} framework (\textbf{DVIS}). DVIS achieves new SOTA performance in both VIS and VPS, surpassing the current SOTA methods by 7.3 AP and 9.6 VPQ on the OVIS and VIPSeg datasets, which are the most challenging and realistic benchmarks. Moreover, thanks to the decoupling strategy, the referring tracker and temporal refiner are super light-weight (only 1.69\% of the segmenter FLOPs), allowing for efficient training and inference on a single GPU with 11G memory. The code is available at \href{https://github.com/zhang-tao-whu/DVIS}{https://github.com/zhang-tao-whu/DVIS}.

翻译：暂无翻译

0

相关内容

HTTPS

超文本传输安全协议是超文本传输协议和 SSL/TLS 的组合，用以提供加密通讯及对网络服务器身份的鉴定。

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

专知会员服务

16+阅读 · 2022年3月3日

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

专知会员服务

28+阅读 · 2022年3月3日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

语义分割+视频分割开源代码集合

语义分割+视频分割开源代码集合

极市平台

35+阅读 · 2018年3月5日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

Criegee自由基大气化学反应机理与二次有机气溶胶形成

国家自然科学基金

0+阅读 · 2014年12月31日

稀土簇及稀土与过渡金属簇-有机骨架的构筑及性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

过渡金属簇共轭桥联稀土近红外发光功能配合物的组装、结构与性能

国家自然科学基金

0+阅读 · 2012年12月31日

MOF/CNT/CTA表界面结构调控及复杂气体吸附机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

新型高效钌(II)配合物有机无机杂化氧气传感纳米纤维的静电纺丝制备及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

钙离子对有机物形成正渗透膜污染的影响机制及调控

国家自然科学基金

0+阅读 · 2012年12月31日

基于稀土离子f-d跃迁的FED新发光材料的设计、合成和光谱性质研究

国家自然科学基金

0+阅读 · 2011年12月31日

氮杂双环手性有序介孔有机硅催化剂的设计、合成及催化反应研究

国家自然科学基金

1+阅读 · 2009年12月31日

基于柱芳烃有机纳米管的组装合成及性能研究

国家自然科学基金

0+阅读 · 2009年12月31日

苯乙烯-异戊二烯（或丁二烯）嵌段共聚物表面最外层结构的形成及其影响因素

国家自然科学基金

0+阅读 · 2008年12月31日

DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation

Arxiv

0+阅读 · 2023年7月31日

Towards Unbalanced Motion: Part-Decoupling Network for Video Portrait Segmentation

Arxiv

0+阅读 · 2023年7月31日

Object-Centric Video Prediction via Decoupling of Object Dynamics and Interactions

Arxiv

0+阅读 · 2023年7月31日

Fast Full-frame Video Stabilization with Iterative Optimization

Arxiv

0+阅读 · 2023年7月31日

XMem++: Production-level Video Segmentation From Few Annotated Frames

Arxiv

0+阅读 · 2023年7月29日

End-to-End Video Instance Segmentation with Transformers

Arxiv

10+阅读 · 2021年3月24日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation

A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation

Arxiv

15+阅读 · 2018年8月2日

VIP会员

文章信息

相关主题

相互独立的

相关VIP内容

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

专知会员服务

16+阅读 · 2022年3月3日

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

专知会员服务

28+阅读 · 2022年3月3日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型智能体强化学习：全景综述

《城市滨海地区：理解复杂多变环境下的指挥控制框架》50页报告

【伯克利博士论文】从推理服务到训练：面向大规模 LLM 智能体的高效系统

美空军“顶点2025”实验：推进AI在C2、动态目标锁定与联盟集成中的应用

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

语义分割+视频分割开源代码集合

语义分割+视频分割开源代码集合

极市平台

35+阅读 · 2018年3月5日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation

Arxiv

0+阅读 · 2023年7月31日

Towards Unbalanced Motion: Part-Decoupling Network for Video Portrait Segmentation

Arxiv

0+阅读 · 2023年7月31日

Object-Centric Video Prediction via Decoupling of Object Dynamics and Interactions

Arxiv

0+阅读 · 2023年7月31日

Fast Full-frame Video Stabilization with Iterative Optimization

Arxiv

0+阅读 · 2023年7月31日

XMem++: Production-level Video Segmentation From Few Annotated Frames

Arxiv

0+阅读 · 2023年7月29日

End-to-End Video Instance Segmentation with Transformers

Arxiv

10+阅读 · 2021年3月24日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation

A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation

Arxiv

15+阅读 · 2018年8月2日

相关基金

Criegee自由基大气化学反应机理与二次有机气溶胶形成

国家自然科学基金

0+阅读 · 2014年12月31日

稀土簇及稀土与过渡金属簇-有机骨架的构筑及性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

过渡金属簇共轭桥联稀土近红外发光功能配合物的组装、结构与性能

国家自然科学基金

0+阅读 · 2012年12月31日

MOF/CNT/CTA表界面结构调控及复杂气体吸附机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

新型高效钌(II)配合物有机无机杂化氧气传感纳米纤维的静电纺丝制备及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

钙离子对有机物形成正渗透膜污染的影响机制及调控

国家自然科学基金

0+阅读 · 2012年12月31日

基于稀土离子f-d跃迁的FED新发光材料的设计、合成和光谱性质研究

国家自然科学基金

0+阅读 · 2011年12月31日

氮杂双环手性有序介孔有机硅催化剂的设计、合成及催化反应研究

国家自然科学基金

1+阅读 · 2009年12月31日

基于柱芳烃有机纳米管的组装合成及性能研究

国家自然科学基金

0+阅读 · 2009年12月31日

苯乙烯-异戊二烯（或丁二烯）嵌段共聚物表面最外层结构的形成及其影响因素

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员