Video frame interpolation (VFI) increases the frame rate of a video by inserting a reconstructed frame between two consecutive frames. Because ordinary cameras capture at a fixed frame rate, frame-only VFI methods inevitably lose the dynamics in the interval between consecutive frames. To compensate for the missing inter-frame information, motion models are often used, but these models cannot account for the real motion. Event cameras are bio-inspired vision sensors in which each pixel independently perceives and encodes relative changes in light intensity. Instead of frames, event cameras output sparse, asynchronous streams of events, with the advantages of high temporal resolution, high dynamic range, and low power consumption. An event is usually expressed as a tuple e=(x,y,p,t), meaning that at timestamp t an event with polarity p is generated at pixel (x,y). Positive polarity indicates that the light intensity has increased (from weak to strong) by more than a threshold, while negative polarity indicates a decrease by more than the threshold. Because an event camera has a temporal resolution on the order of microseconds, it can capture the complete changes or motion between frames. The event stream thus embodies the inter-frame changes. Therefore, optical flow estimated from events does not require fitting any motion model and can be inherently nonlinear. Since events lack intensity information, frame-based optical flow is complementary to event-based optical flow. By combining these two kinds of optical flow, more accurate estimation results can be obtained. Meanwhile, because the real inter-frame dynamics are captured, high-quality keyframes can be reconstructed at any timestamp.
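To make the event tuple e=(x,y,p,t) concrete, the following is a minimal sketch of how events falling between two timestamps could be accumulated into a per-pixel signed count. It is an illustration only, not the method described above: the structured-array layout `EVENT_DTYPE`, the function name `accumulate_events`, the encoding of polarity as +1/-1, and the timestamps in seconds are all assumptions made for this example.

```python
import numpy as np

# Hypothetical layout: one record per event e = (x, y, p, t),
# with p in {+1, -1} and t in seconds (both are assumptions).
EVENT_DTYPE = np.dtype(
    [("x", np.int32), ("y", np.int32), ("p", np.int8), ("t", np.float64)]
)


def accumulate_events(events, height, width, t_start, t_end):
    """Sum event polarities per pixel for events with t_start <= t < t_end."""
    mask = (events["t"] >= t_start) & (events["t"] < t_end)
    selected = events[mask]
    frame = np.zeros((height, width), dtype=np.int32)
    # np.add.at accumulates correctly when the same pixel fires several times,
    # which plain fancy-index assignment (frame[y, x] += p) would not.
    np.add.at(
        frame,
        (selected["y"], selected["x"]),
        selected["p"].astype(np.int32),
    )
    return frame


# Four toy events on a 2x3 sensor; the last one lies outside the window.
events = np.array(
    [
        (1, 0, 1, 0.0001),
        (1, 0, 1, 0.0002),
        (2, 1, -1, 0.0003),
        (0, 0, 1, 0.0009),
    ],
    dtype=EVENT_DTYPE,
)
frame = accumulate_events(events, height=2, width=3, t_start=0.0, t_end=0.0005)
print(frame)
```

In this toy input, pixel (x=1, y=0) receives two positive events and pixel (x=2, y=1) one negative event, so the accumulated image is nonzero only at those two locations. Because the window bounds are arbitrary timestamps rather than frame indices, the same routine can summarize the changes over any sub-interval between two frames.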