Video frame interpolation (VFI) enables many important applications that may involve the temporal domain, such as slow-motion playback, or the spatial domain, such as stop-motion sequences. We focus on the former task, where one of the key challenges is handling high dynamic range (HDR) scenes in the presence of complex motion. To this end, we explore the possible advantages of dual-exposure sensors, which readily provide sharp short and blurry long exposures that are spatially registered and whose ends are temporally aligned. In this setup, motion blur records temporally continuous information about the scene motion that, combined with the sharp reference, enables more precise motion sampling within a single camera shot. We demonstrate that this facilitates the reconstruction of more complex motion in the VFI task, as well as HDR frame reconstruction, which so far has been considered only for the originally captured frames and not for in-between interpolated frames. We design a neural network trained on these tasks that clearly outperforms existing solutions. We also propose a metric for scene motion complexity that provides important insights into the performance of VFI methods at test time.