We propose an optical flow-guided approach for semi-supervised video object segmentation. Optical flow is commonly exploited as additional guidance in unsupervised video object segmentation, but its relevance to the semi-supervised setting has not been fully explored. In this work, we follow an encoder-decoder approach to the segmentation task. We propose a model that extracts combined information from optical flow and the image; this combined representation is then used as input to the target model and the decoder network. Unlike previous methods, which integrate image data and optical flow by concatenation, our work uses a simple yet effective attention mechanism. Experiments on DAVIS 2017 and YouTube-VOS 2019 show that integrating the information extracted from optical flow into the original image branch yields a strong performance gain, and that our method achieves state-of-the-art performance.
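To make the contrast between concatenation and attention-based fusion concrete, the following is a minimal sketch and not the architecture described above. The abstract does not specify the form of the attention mechanism, so the sigmoid gate with a residual connection, the function names `concat_fuse` and `attention_fuse`, and the list-of-channels feature layout are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def concat_fuse(image_feats, flow_feats):
    """Baseline fusion: append the flow channels after the image channels.

    Each argument is a list of channels; each channel is a list of
    values, one per spatial position.
    """
    return image_feats + flow_feats


def attention_fuse(image_feats, flow_feats):
    """Hypothetical gated fusion (an assumption, not the paper's design).

    Flow features are squashed to weights in (0, 1) and used to re-weight
    the image features at each position. The residual term keeps the
    original image signal when the gate is near zero.
    """
    fused = []
    for img_ch, flow_ch in zip(image_feats, flow_feats):
        fused.append([v + v * sigmoid(f) for v, f in zip(img_ch, flow_ch)])
    return fused


# Toy features: 2 channels, 2 spatial positions each.
image_feats = [[1.0, 2.0], [3.0, 4.0]]
flow_feats = [[0.0, 0.0], [0.0, 0.0]]

stacked = concat_fuse(image_feats, flow_feats)   # 4 channels
gated = attention_fuse(image_feats, flow_feats)  # still 2 channels
```

In this sketch, concatenation doubles the channel count and leaves it to later layers to relate the two modalities, whereas the gated variant keeps the image branch's shape and lets the flow signal modulate it directly.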