To date, hardly any large-scale datasets exist with dense optical flow of non-rigid motion derived from real-world imagery. The reason lies mainly in the setup required to derive ground-truth optical flow: a series of images with known camera poses along their trajectory, and an accurate, textured 3D model of the scene. Human annotation is not only too tedious for large databases, it is also hardly able to yield accurate optical flow. To circumvent the need for manual annotation, we propose a framework that automatically generates optical flow from real-world videos. The method extracts and matches objects across video frames to compute initial constraints, and applies a deformation to the objects of interest to obtain dense optical flow fields; a simplified sketch of this densification step is given below. We also propose several ways to augment the optical flow variations. Extensive experimental results show that FlowNet-S, LiteFlowNet, PWC-Net, and RAFT trained on our automatically generated optical flow outperform the same networks trained on rigid synthetic data. The datasets and the implementation of our optical flow generation framework are released at https://github.com/lhoangan/arap_flow
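To make the pipeline concrete, the following is a minimal sketch (not the authors' released code) of turning sparse object matches into a dense per-pixel flow field, standing in for the deformation step described above. The object mask, the matched keypoints, and the thin-plate-spline interpolation are illustrative assumptions, not details taken from the abstract.

```python
# Hypothetical sketch: densify sparse matches over one object into a flow
# field. Assumes inputs from an upstream object extraction/matching stage.
import numpy as np
from scipy.interpolate import RBFInterpolator

def dense_flow_from_matches(mask, pts_src, pts_dst):
    """Interpolate sparse correspondences to dense flow over one object.

    mask    : (H, W) bool array marking the object of interest
    pts_src : (N, 2) matched keypoint coordinates (x, y) in frame 1
    pts_dst : (N, 2) matched keypoint coordinates (x, y) in frame 2
    returns : (H, W, 2) flow field, zero outside the mask
    """
    displacements = pts_dst - pts_src          # sparse initial constraints
    interp = RBFInterpolator(pts_src, displacements,
                             kernel="thin_plate_spline", smoothing=1.0)
    ys, xs = np.nonzero(mask)                  # pixels inside the object
    queries = np.stack([xs, ys], axis=1).astype(float)
    flow = np.zeros(mask.shape + (2,), dtype=float)
    flow[ys, xs] = interp(queries)             # dense, non-rigid flow
    return flow
```

The paper's actual deformation model may differ (the repository name suggests an as-rigid-as-possible formulation); the sketch only illustrates the idea of propagating sparse constraints into a dense, non-rigid flow field restricted to a segmented object.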