Recently, Google proposes DDVM which for the first time demonstrates that a general diffusion model for image-to-image translation task works impressively well on optical flow estimation task without any specific designs like RAFT. However, DDVM is still a closed-source model with the expensive and private Palette-style pretraining. In this technical report, we present the first open-source DDVM by reproducing it. We study several design choices and find those important ones. By training on 40k public data with 4 GPUs, our reproduction achieves comparable performance to the closed-source DDVM. The code and model have been released in https://github.com/DQiaole/FlowDiffusion_pytorch.
翻译:暂无翻译