Estimating neural radiance fields (NeRFs) from "ideal" images has been extensively studied in the computer vision community. Most approaches assume optimal illumination and slow camera motion. These assumptions are often violated in robotic applications, where images may contain motion blur and the scene may lack suitable illumination. This can cause significant problems for downstream tasks such as navigation, inspection, or visualization of the scene. To alleviate these problems, we present E-NeRF, the first method that estimates a volumetric scene representation in the form of a NeRF from a fast-moving event camera. Our method can recover NeRFs during very fast motion and in high-dynamic-range conditions where frame-based approaches fail. We show that high-quality frames can be rendered from an event stream alone. Furthermore, by combining events and frames, we can estimate NeRFs of higher quality than state-of-the-art approaches under severe motion blur. We also show that combining events and frames can overcome failure cases of NeRF estimation in scenarios where only a few input views are available, without requiring additional regularization.