Video motion magnification techniques amplify small motions in a video to reveal movement that would otherwise be invisible. Their uses extend from biomedical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, less blurriness, and fewer artifacts than prior art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.