We propose to incorporate feature correlation and sequential processing into dense optical flow estimation from event cameras. Modern frame-based optical flow methods rely heavily on matching costs computed from feature correlation. In contrast, there exists no optical flow method for event cameras that explicitly computes matching costs; instead, learning-based approaches using events usually resort to the U-Net architecture to estimate optical flow sparsely. Our key finding is that introducing correlation features significantly improves results compared to previous methods that rely solely on convolution layers. Compared to the state of the art, our proposed approach computes dense optical flow and reduces the end-point error by 23% on MVSEC. Furthermore, we show that all event-based optical flow methods developed so far have been evaluated on datasets with very small displacement fields, with a maximum flow magnitude of 10 pixels. Based on this observation, we introduce a new real-world dataset that exhibits displacement fields with magnitudes up to 210 pixels and a three times higher camera resolution. Our proposed approach reduces the end-point error on this dataset by 66%.
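To make the matching-cost idea concrete, the following is a minimal sketch of an all-pairs feature correlation volume in the style of frame-based methods such as RAFT. The function name, tensor shapes, and the scaling by the square root of the channel dimension are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of an all-pairs correlation (matching-cost) volume,
# as used by frame-based methods such as RAFT. Tensor names, shapes,
# and the sqrt(C) scaling are illustrative assumptions.
import torch


def correlation_volume(fmap1: torch.Tensor, fmap2: torch.Tensor) -> torch.Tensor:
    """Compute matching costs between every pixel pair of two feature maps.

    fmap1, fmap2: feature maps of shape (B, C, H, W), e.g. extracted
    from two consecutive event representations.
    Returns a volume of shape (B, H, W, H, W), where entry
    [b, i, j, k, l] is the scaled dot product between the feature
    vector at (i, j) in fmap1 and the one at (k, l) in fmap2.
    """
    b, c, h, w = fmap1.shape
    f1 = fmap1.view(b, c, h * w)                 # (B, C, HW)
    f2 = fmap2.view(b, c, h * w)                 # (B, C, HW)
    corr = torch.einsum('bci,bcj->bij', f1, f2)  # all-pairs dot products
    corr = corr / c ** 0.5                       # scale for numerical stability
    return corr.view(b, h, w, h, w)


# Usage example: correlate features from two consecutive event windows
# (hypothetical shapes, e.g. features at 1/8 of the input resolution).
feats_t0 = torch.randn(1, 256, 46, 60)  # features at time t
feats_t1 = torch.randn(1, 256, 46, 60)  # features at time t + dt
volume = correlation_volume(feats_t0, feats_t1)
print(volume.shape)  # torch.Size([1, 46, 60, 46, 60])
```

Such a volume makes the matching cost between every pair of locations explicit, which is the ingredient the abstract contrasts with purely convolutional U-Net approaches.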