Video object segmentation (VOS) describes the task of segmenting a set of objects in each frame of a video. In the semi-supervised setting, the first mask of each object is provided at test time. Following the one-shot principle, fine-tuning VOS methods train a segmentation model separately on each given object mask. Recently, however, the VOS community has deemed such test-time optimization and its impact on the test runtime unfeasible. To mitigate the inefficiencies of previous fine-tuning approaches, we present efficient One-Shot Video Object Segmentation (e-OSVOS). In contrast to most VOS approaches, e-OSVOS decouples the object detection task and predicts only local segmentation masks by applying a modified version of Mask R-CNN. The one-shot test runtime and performance are optimized without a laborious and handcrafted hyperparameter search. To this end, we meta-learn the model initialization and learning rates for the test-time optimization. To achieve optimal learning behavior, we predict individual learning rates at a neuron level. Furthermore, we apply an online adaptation to address the common performance degradation throughout a sequence by continuously fine-tuning the model on previous mask predictions, supported by a frame-to-frame bounding box propagation. e-OSVOS provides state-of-the-art results on DAVIS 2016, DAVIS 2017, and YouTube-VOS among one-shot fine-tuning methods while reducing the test runtime substantially. Code is available at https://github.com/dvl-tum/e-osvos.
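The core idea of the test-time optimization is easiest to see in code. The following is a minimal sketch in PyTorch, not the authors' implementation: it assumes a generic segmentation `model` that maps a frame to mask logits, and shows how meta-learned, neuron-level learning rates (one step size per output neuron/channel) could drive a short one-shot fine-tuning loop on the provided first-frame mask. In e-OSVOS the model is a modified Mask R-CNN and both the initialization and the learning rates come from an outer meta-training loop, which is omitted here; the helper names below are illustrative assumptions.

```python
# Minimal sketch (not the official e-OSVOS code): one-shot test-time
# fine-tuning with meta-learned, neuron-level learning rates.
# `model` is assumed to map a frame tensor to mask logits of the same
# spatial size as `first_mask`.
import torch
import torch.nn.functional as F


def make_neuron_lrs(model, init_lr=1e-3):
    """Create one learnable learning rate per output neuron/channel of each layer.

    In the actual method these tensors would be optimized in the outer
    (meta) loop; here they are simply initialized to a constant.
    """
    lrs = {}
    for name, p in model.named_parameters():
        # Shape (out_dim, 1, ..., 1) broadcasts over the remaining weight
        # dimensions, i.e. one learning rate per output neuron/channel.
        shape = (p.shape[0],) + (1,) * (p.dim() - 1)
        lrs[name] = torch.full(shape, init_lr, requires_grad=True)
    return lrs


def one_shot_finetune(model, first_frame, first_mask, neuron_lrs, steps=10):
    """Inner loop: adapt the (meta-learned) initialization to one object,
    using only the ground-truth mask of the first frame."""
    for _ in range(steps):
        logits = model(first_frame)                       # predicted mask logits
        loss = F.binary_cross_entropy_with_logits(logits, first_mask)
        grads = torch.autograd.grad(loss, list(model.parameters()))
        with torch.no_grad():
            for (name, p), g in zip(model.named_parameters(), grads):
                p -= neuron_lrs[name] * g                 # per-neuron step size
    return model
```

During meta-training, the outer loop would backpropagate a segmentation loss on later frames through this inner loop to update both the model initialization and `neuron_lrs`; at test time, only the short inner loop above is run per object, which is what keeps the fine-tuning runtime low.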

