Event cameras have recently gained significant traction since they open up new avenues for low-latency and low-power solutions to complex computer vision problems. To unlock these solutions, it is necessary to develop algorithms that can leverage the unique nature of event data. However, the current state-of-the-art is still highly influenced by the frame-based literature, and usually fails to deliver on these promises. In this work, we take this into consideration and propose a novel self-supervised learning pipeline for the sequential estimation of event-based optical flow that allows for the scaling of the models to high inference frequencies. At its core, we have a continuously-running stateful neural model that is trained using a novel formulation of contrast maximization that makes it robust to nonlinearities and varying statistics in the input events. Results across multiple datasets confirm the effectiveness of our method, which establishes a new state of the art in terms of accuracy for approaches trained or optimized without ground truth.
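The training objective mentioned above builds on contrast maximization: events are motion-compensated (warped) along a candidate flow, and the sharpness of the resulting image of warped events is maximized. Below is a minimal sketch of this general idea using variance as the contrast measure; it illustrates the principle only, not the paper's robust formulation, and all function names, the per-event warp, and the nearest-neighbor accumulation are illustrative assumptions.

```python
import numpy as np

def contrast(xs, ys, ts, flow, t_ref, H, W):
    """Variance of the image of warped events (generic contrast measure).

    xs, ys, ts : per-event pixel coordinates and timestamps (arrays)
    flow       : assumed constant (u, v) flow in pixels per unit time
    t_ref      : reference time the events are warped to
    """
    # Warp each event to t_ref along the candidate flow
    xw = xs + (t_ref - ts) * flow[0]
    yw = ys + (t_ref - ts) * flow[1]
    # Accumulate warped events into an image (nearest-neighbor binning)
    img = np.zeros((H, W))
    xi = np.clip(np.round(xw).astype(int), 0, W - 1)
    yi = np.clip(np.round(yw).astype(int), 0, H - 1)
    np.add.at(img, (yi, xi), 1.0)
    # Correct flow collapses events onto edges -> a sharper, higher-variance image
    return img.var()
```

For a point moving at 4 px per unit time in x, warping with the true flow stacks all events onto one pixel, yielding higher contrast than warping with zero flow; a learned model is trained to output the flow that maximizes this sharpness.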