BiFormer: 基于双侧 Transformer 的双侧运动估计学习，用于 4K 视频帧插值 (BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation) - 专知论文

会员服务 ·

0

BiFormer · 运动估计 · 全局运动估计 · 变换 · 合成 ·

2023 年 4 月 5 日

BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

翻译：BiFormer: 基于双侧 Transformer 的双侧运动估计学习，用于 4K 视频帧插值

Junheum Park,Jintae Kim,Chang-Su Kim

from arxiv, Accepted to CVPR2023

A novel 4K video frame interpolator based on bilateral transformer (BiFormer) is proposed in this paper, which performs three steps: global motion estimation, local motion refinement, and frame synthesis. First, in global motion estimation, we predict symmetric bilateral motion fields at a coarse scale. To this end, we propose BiFormer, the first transformer-based bilateral motion estimator. Second, we refine the global motion fields efficiently using blockwise bilateral cost volumes (BBCVs). Third, we warp the input frames using the refined motion fields and blend them to synthesize an intermediate frame. Extensive experiments demonstrate that the proposed BiFormer algorithm achieves excellent interpolation performance on 4K datasets. The source codes are available at https://github.com/JunHeum/BiFormer.

翻译：本文提出了一种基于双侧 Transformer 的 4K 视频帧插值器 BiFormer，它执行三个步骤：全局运动估计、局部运动细化和帧合成。首先，在全局运动估计中，我们在粗略尺度上预测对称的双侧运动场。为此，我们提出了 BiFormer，第一个基于 Transformer 的双侧运动估计器。其次，我们使用块状双侧代价体量（BBCVs）高效地细化全局运动场。第三，我们使用细化的运动场对输入帧进行变形，然后混合它们以合成中间帧。广泛的实验表明，提出的 BiFormer 算法在四个数据集上都获得了出色的插值性能。源代码可在 https://github.com/JunHeum/BiFormer 获得。

0

相关内容

BiFormer

【CVPR2023】基于图像特定提示学习的零样本生成模型自适应

【CVPR2023】基于图像特定提示学习的零样本生成模型自适应

专知会员服务

31+阅读 · 2023年4月7日

【CVPR2023】BiFormer:基于双层路由注意力的视觉Transformer

【CVPR2023】BiFormer:基于双层路由注意力的视觉Transformer

专知会员服务

35+阅读 · 2023年3月20日

【CVPR2022】基于密集学习的半监督目标检测

【CVPR2022】基于密集学习的半监督目标检测

专知会员服务

20+阅读 · 2022年4月19日

【CVPR2022】端到端实时矢量边缘提取（E2EC）

【CVPR2022】端到端实时矢量边缘提取（E2EC）

专知会员服务

16+阅读 · 2022年4月14日

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

专知会员服务

16+阅读 · 2022年3月3日

【ICCV2021】用于目标检测和实例分割的新损失函数

专知会员服务

22+阅读 · 2021年7月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

图像处理：从 bilateral filter 到 HDRnet

图像处理：从 bilateral filter 到 HDRnet

极市平台

30+阅读 · 2019年8月7日

【泡泡一分钟】三维卷积神经网络实现实时非模态三维目标检测

【泡泡一分钟】三维卷积神经网络实现实时非模态三维目标检测

泡泡机器人SLAM

12+阅读 · 2019年5月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

【泡泡一分钟】用于评估视觉惯性里程计的TUM VI数据集

【泡泡一分钟】用于评估视觉惯性里程计的TUM VI数据集

泡泡机器人SLAM

11+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

基于特征学习的空间非合作目标单目视觉位姿测量研究

国家自然科学基金

2+阅读 · 2015年12月31日

近临界随机环境中随机游动的若干极限性质

国家自然科学基金

0+阅读 · 2015年12月31日

基于人眼视觉特性与ASIFT的多尺度变换域视频水印算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于运动学正解和自适应逆控制的并联机构波形复现研究

国家自然科学基金

0+阅读 · 2013年12月31日

高维环境中随机核矩阵的谱分析

国家自然科学基金

0+阅读 · 2013年12月31日

Kalirin 7 在雌激素调节海马神经元可塑性中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

脊髓运动神经元的突触可塑性及其细胞和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于时空流形学习与概率图模型的人体动作识别

国家自然科学基金

2+阅读 · 2012年12月31日

E3泛素连接酶Synoviolin对胰岛β细胞功能的影响

国家自然科学基金

0+阅读 · 2012年12月31日

流形空间中影像控制结构的嵌入和匹配研究

国家自然科学基金

0+阅读 · 2011年12月31日

HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation

Arxiv

0+阅读 · 2023年5月22日

READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation

Arxiv

0+阅读 · 2023年5月22日

Efficient Mixed Transformer for Single Image Super-Resolution

Arxiv

0+阅读 · 2023年5月19日

T-former: An Efficient Transformer for Image Inpainting

Arxiv

0+阅读 · 2023年5月19日

Enhancing Transformer Backbone for Egocentric Video Action Segmentation

Arxiv

0+阅读 · 2023年5月19日

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

Arxiv

12+阅读 · 2021年12月30日

Interpretable CNNs for Object Classification

Interpretable CNNs for Object Classification

Arxiv

20+阅读 · 2020年3月12日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks

Arxiv

13+阅读 · 2018年1月6日

VIP会员

文章信息

相关主题

全局运动估计

相关VIP内容

【CVPR2023】基于图像特定提示学习的零样本生成模型自适应

【CVPR2023】基于图像特定提示学习的零样本生成模型自适应

专知会员服务

31+阅读 · 2023年4月7日

【CVPR2023】BiFormer:基于双层路由注意力的视觉Transformer

【CVPR2023】BiFormer:基于双层路由注意力的视觉Transformer

专知会员服务

35+阅读 · 2023年3月20日

【CVPR2022】基于密集学习的半监督目标检测

【CVPR2022】基于密集学习的半监督目标检测

专知会员服务

20+阅读 · 2022年4月19日

【CVPR2022】端到端实时矢量边缘提取（E2EC）

【CVPR2022】端到端实时矢量边缘提取（E2EC）

专知会员服务

16+阅读 · 2022年4月14日

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

【CVPR 2022】基于Tracklet查询和建议的高效视频实例分割，Efficient Video Instance Segmentation via Tracklet Query and Proposal

专知会员服务

16+阅读 · 2022年3月3日

【ICCV2021】用于目标检测和实例分割的新损失函数

专知会员服务

22+阅读 · 2021年7月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

图像处理：从 bilateral filter 到 HDRnet

图像处理：从 bilateral filter 到 HDRnet

极市平台

30+阅读 · 2019年8月7日

【泡泡一分钟】三维卷积神经网络实现实时非模态三维目标检测

【泡泡一分钟】三维卷积神经网络实现实时非模态三维目标检测

泡泡机器人SLAM

12+阅读 · 2019年5月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

【泡泡一分钟】用于评估视觉惯性里程计的TUM VI数据集

【泡泡一分钟】用于评估视觉惯性里程计的TUM VI数据集

泡泡机器人SLAM

11+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

【泡泡一分钟】基于多视图卷积网络的草图三维重建技术(3dv-66)

泡泡机器人SLAM

11+阅读 · 2018年3月31日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

相关论文

HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation

Arxiv

0+阅读 · 2023年5月22日

READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation

Arxiv

0+阅读 · 2023年5月22日

Efficient Mixed Transformer for Single Image Super-Resolution

Arxiv

0+阅读 · 2023年5月19日

T-former: An Efficient Transformer for Image Inpainting

Arxiv

0+阅读 · 2023年5月19日

Enhancing Transformer Backbone for Egocentric Video Action Segmentation

Arxiv

0+阅读 · 2023年5月19日

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

Arxiv

12+阅读 · 2021年12月30日

Interpretable CNNs for Object Classification

Interpretable CNNs for Object Classification

Arxiv

20+阅读 · 2020年3月12日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks

Arxiv

13+阅读 · 2018年1月6日

相关基金

基于特征学习的空间非合作目标单目视觉位姿测量研究

国家自然科学基金

2+阅读 · 2015年12月31日

近临界随机环境中随机游动的若干极限性质

国家自然科学基金

0+阅读 · 2015年12月31日

基于人眼视觉特性与ASIFT的多尺度变换域视频水印算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于运动学正解和自适应逆控制的并联机构波形复现研究

国家自然科学基金

0+阅读 · 2013年12月31日

高维环境中随机核矩阵的谱分析

国家自然科学基金

0+阅读 · 2013年12月31日

Kalirin 7 在雌激素调节海马神经元可塑性中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

脊髓运动神经元的突触可塑性及其细胞和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于时空流形学习与概率图模型的人体动作识别

国家自然科学基金

2+阅读 · 2012年12月31日

E3泛素连接酶Synoviolin对胰岛β细胞功能的影响

国家自然科学基金

0+阅读 · 2012年12月31日

流形空间中影像控制结构的嵌入和匹配研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员