State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. In the absence of additional information, first-order approximations, i.e. optical flow, must be used, but this choice restricts the types of motions that can be modeled, leading to errors in highly dynamic scenarios. Event cameras are novel sensors that address this limitation by providing auxiliary visual information in the blind-time between frames. They asynchronously measure per-pixel brightness changes and do this with high temporal resolution and low latency. Event-based frame interpolation methods typically adopt a synthesis-based approach, where predicted frame residuals are directly applied to the key-frames. However, while these approaches can capture non-linear motions, they suffer from ghosting and perform poorly in low-texture regions with few events. Thus, synthesis-based and flow-based approaches are complementary. In this work, we introduce Time Lens, a novel method that leverages the advantages of both. We extensively evaluate our method on three synthetic and two real benchmarks, where we show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods. Finally, we release a new large-scale dataset in highly dynamic scenarios, aimed at pushing the limits of existing methods.
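To make the first-order assumption and the event measurement model concrete, the display equations below are an illustrative sketch in standard notation; the symbols $\tau$, $F_{0\to 1}$, and the contrast threshold $C$ are conventions from the frame-interpolation and event-camera literature, not definitions taken from this paper. A flow-based interpolator assumes each pixel of key-frame $I_0$ travels along a straight line toward key-frame $I_1$, so the frame at a latent time $\tau$ is obtained by scaling the optical flow:
\[
\hat{I}_\tau\bigl(\mathbf{x} + \tau\, F_{0\to 1}(\mathbf{x})\bigr) \;\approx\; I_0(\mathbf{x}), \qquad \tau \in (0,1),
\]
where $F_{0\to 1}$ is the optical flow from $I_0$ to $I_1$; any motion that deviates from this straight-line model is mis-interpolated. An event camera, by contrast, emits a signed event at pixel $\mathbf{x}$ and time $t$ whenever the log-brightness change since the last event at that pixel crosses the contrast threshold $C$:
\[
\bigl|\log I(\mathbf{x}, t) - \log I(\mathbf{x}, t - \Delta t)\bigr| \;\ge\; C,
\]
which is the mechanism that supplies motion information in the blind-time between frames, asynchronously and with microsecond-level latency.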