Animals have evolved highly functional visual systems to understand motion, assisting perception even in complex environments. In this paper, we work towards developing a computer vision system able to segment objects by exploiting motion cues, i.e. motion segmentation. We make the following contributions: First, we introduce a simple variant of the Transformer to segment optical flow frames into primary objects and the background. Second, we train the architecture in a self-supervised manner, i.e. without using any manual annotations. Third, we analyze several critical components of our method and conduct thorough ablation studies to validate their necessity. Fourth, we evaluate the proposed architecture on public benchmarks (DAVIS2016, SegTrackv2, and FBMS59). Despite using only optical flow as input, our approach achieves superior or comparable results to previous state-of-the-art self-supervised methods, while being an order of magnitude faster. We additionally evaluate on a challenging camouflage dataset (MoCA), significantly outperforming other self-supervised approaches and comparing favourably to the top supervised approach, highlighting the importance of motion cues and the potential bias towards visual appearance in existing video segmentation models.