We propose "factor matting", an alternative formulation of the video matting problem in terms of counterfactual video synthesis that is better suited for re-composition tasks. The goal of factor matting is to separate the contents of a video into independent components, each visualizing a counterfactual version of the scene in which the contents of the other components have been removed. We show that factor matting maps well to a more general Bayesian framing of the matting problem that accounts for complex conditional interactions between layers. Based on this observation, we present a method for solving the factor matting problem that produces useful decompositions even for videos with complex cross-layer interactions such as splashes, shadows, and reflections. Our method is trained per video and requires neither pre-training on large external datasets nor knowledge of the 3D structure of the scene. We conduct extensive experiments and show that our method can not only disentangle scenes with complex interactions, but also outperforms top methods on existing tasks such as classical video matting and background subtraction. In addition, we demonstrate the benefits of our approach on a range of downstream tasks. Please refer to our project webpage for more details: https://factormatte.github.io