Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models - Zhuanzhi (专知) Papers


Image restoration · Diffusion models · Shadow removal · U-Net · Dehazing

April 17, 2023

Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models


Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

From arXiv; CVPRW 2023. Runner-up method in the NTIRE 2023 Image Shadow Removal Challenge. Code is available at https://github.com/Algolzw/image-restoration-sde

This work aims to improve the applicability of diffusion models in realistic image restoration. Specifically, we enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size, and optimizer/scheduler. We show that tuning these hyperparameters allows us to achieve better performance on both distortion and perceptual scores. We also propose a U-Net based latent diffusion model which performs diffusion in a low-resolution latent space while preserving high-resolution information from the original input for the decoding process. Compared to the previous latent-diffusion model which trains a VAE-GAN to compress the image, our proposed U-Net compression strategy is significantly more stable and can recover highly accurate images without relying on adversarial optimization. Importantly, these modifications allow us to apply diffusion models to various image restoration tasks, including real-world shadow removal, HR non-homogeneous dehazing, stereo super-resolution, and bokeh effect transformation. By simply replacing the datasets and slightly changing the noise network, our model, named Refusion, is able to deal with large-size images (e.g., 6000 x 4000 x 3 in HR dehazing) and produces good results on all the above restoration problems. Our Refusion achieves the best perceptual performance in the NTIRE 2023 Image Shadow Removal Challenge and wins 2nd place overall.
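The abstract describes restoring in a low-resolution latent space while keeping high-resolution information from the input for decoding. The toy sketch below illustrates only that idea and is not the authors' implementation: average pooling stands in for the learned U-Net encoder, a stored residual stands in for the high-resolution features passed to the decoder, and `restore_in_latent` is a hypothetical placeholder (identity) for the diffusion-based restoration step. All function names and the downsampling factor are assumptions made for illustration.

```python
# Toy illustration only: pooling and a residual stand in for the learned
# U-Net encoder/decoder; none of these names come from the Refusion code.

def downsample(img, f):
    """Average-pool a 2D list-of-lists image by an integer factor f."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y * f + dy][x * f + dx] for dy in range(f) for dx in range(f))
            / (f * f)
            for x in range(w // f)
        ]
        for y in range(h // f)
    ]


def upsample(lat, f):
    """Nearest-neighbour upsampling of a 2D latent by an integer factor f."""
    return [
        [lat[y // f][x // f] for x in range(len(lat[0]) * f)]
        for y in range(len(lat) * f)
    ]


def encode(img, f):
    """Return the low-res latent and the high-res detail kept for decoding."""
    latent = downsample(img, f)
    coarse = upsample(latent, f)
    detail = [[p - c for p, c in zip(rp, rc)] for rp, rc in zip(img, coarse)]
    return latent, detail


def decode(latent, detail, f):
    """Upsample the latent and add back the stored high-res detail."""
    coarse = upsample(latent, f)
    return [[c + d for c, d in zip(rc, rd)] for rc, rd in zip(coarse, detail)]


def restore_in_latent(latent):
    """Hypothetical placeholder for the latent-space restoration (identity)."""
    return [row[:] for row in latent]


# An 8 x 8 synthetic grayscale image with integer-valued pixels.
image = [[float((x * 7 + y * 3) % 11) for x in range(8)] for y in range(8)]
factor = 4  # assumed downsampling factor, chosen only for this example

latent, detail = encode(image, factor)
output = decode(restore_in_latent(latent), detail, factor)

latent_size = len(latent) * len(latent[0])
image_size = len(image) * len(image[0])
max_err = max(abs(a - b) for ra, rb in zip(output, image) for a, b in zip(ra, rb))
print(latent_size, image_size, max_err)
```

In this sketch the restoration step touches 16 times fewer values than the full image, yet decoding recovers the input up to floating-point rounding because the high-resolution detail bypasses the latent. In the paper's setting both the compression and the decoding are learned networks, so this shows only the structure of the idea.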


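Related content

Image restoration

Image restoration (inpainting) is the process of reconstructing lost or damaged parts of images and videos. In museums, for example, this work is usually carried out by experienced curators or art restorers. In the digital world, image restoration is also called image interpolation or video interpolation: algorithms replace lost or corrupted image data, mainly small regions and blemishes.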




