To apply optical flow in practice, it is often necessary to resize the input to smaller dimensions in order to reduce computational costs. However, downsizing inputs makes the estimation more challenging because objects and motion ranges become smaller. Even though recent approaches have demonstrated high-quality flow estimation, they tend to fail to accurately model small objects and precise boundaries when the input resolution is lowered, restricting their applicability to high-resolution inputs. In this paper, we introduce AnyFlow, a robust network that estimates accurate flow from images of various resolutions. By representing optical flow as a continuous coordinate-based representation, AnyFlow generates outputs at arbitrary scales from low-resolution inputs, outperforming prior work at capturing tiny objects and preserving detail across a wide range of scenes. We establish a new state-of-the-art in cross-dataset generalization on the KITTI dataset, while achieving accuracy on the online benchmarks comparable to other SOTA methods.
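To give a concrete sense of what a continuous coordinate-based flow representation means, the sketch below is a minimal, hypothetical illustration (not AnyFlow's actual architecture or weights): a small MLP maps a bilinearly sampled local feature plus a continuous coordinate to a 2-D flow vector, so dense flow can be queried at any output resolution from a low-resolution feature map. All sizes, the random weights, and the function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical low-resolution feature map (H, W, C).
H, W, C = 8, 8, 16
feat = rng.standard_normal((H, W, C))

# Random MLP weights standing in for learned parameters.
W1 = rng.standard_normal((C + 2, 32)) * 0.1
W2 = rng.standard_normal((32, 2)) * 0.1

def sample_feature(y, x):
    """Bilinearly sample the feature map at continuous coords in [0, 1]."""
    fy, fx = y * (H - 1), x * (W - 1)
    y0, x0 = int(np.floor(fy)), int(np.floor(fx))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = fy - y0, fx - x0
    return ((1 - wy) * (1 - wx) * feat[y0, x0] +
            (1 - wy) * wx       * feat[y0, x1] +
            wy       * (1 - wx) * feat[y1, x0] +
            wy       * wx       * feat[y1, x1])

def query_flow(y, x):
    """Decode a flow vector (u, v) at an arbitrary continuous coordinate."""
    inp = np.concatenate([sample_feature(y, x), [y, x]])
    return np.maximum(inp @ W1, 0.0) @ W2  # ReLU MLP

# Because the decoder is a function of continuous coordinates, the output
# grid is decoupled from the feature resolution: query at 4x here.
out_h, out_w = 32, 32
flow = np.stack([[query_flow(i / (out_h - 1), j / (out_w - 1))
                  for j in range(out_w)] for i in range(out_h)])
print(flow.shape)  # dense flow field at an arbitrary scale
```

The key property this illustrates is that the same decoder can be evaluated on any target grid, which is what lets such a model produce high-resolution flow from low-resolution inputs.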