HyperThumbnail: Real-time 6K Image Rescaling with Rate-distortion Optimization

Rate distortion · Image reconstruction · Reconstruction · Embedding · Rate-distortion optimization

April 3, 2023

Chenyang Qi, Xin Yang, Ka Leong Cheng, Ying-Cong Chen, Qifeng Chen

From arXiv; accepted by CVPR 2023. GitHub repository: https://github.com/AbnerVictor/HyperThumbnail

Contemporary image rescaling aims at embedding a high-resolution (HR) image into a low-resolution (LR) thumbnail image that contains embedded information for HR image reconstruction. Unlike traditional image super-resolution, this enables high-fidelity HR image restoration faithful to the original one, given the embedded information in the LR thumbnail. However, state-of-the-art image rescaling methods do not optimize the LR image file size for efficient sharing and fall short of real-time performance for ultra-high-resolution (e.g., 6K) image reconstruction. To address these two challenges, we propose a novel framework (HyperThumbnail) for real-time 6K rate-distortion-aware image rescaling. Our framework first embeds an HR image into a JPEG LR thumbnail by an encoder with our proposed quantization prediction module, which minimizes the file size of the embedding LR JPEG thumbnail while maximizing HR reconstruction quality. Then, an efficient frequency-aware decoder reconstructs a high-fidelity HR image from the LR one in real time. Extensive experiments demonstrate that our framework outperforms previous image rescaling baselines in rate-distortion performance and can perform 6K image reconstruction in real time.
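The rate-distortion trade-off mentioned in the abstract can be illustrated with a small sketch. This is not the HyperThumbnail method: the paper's quantization tables are predicted by a learned module, whereas the sketch below uses hand-picked flat tables. The function names `dct_matrix` and `rate_distortion`, the weight `lam`, and the use of the non-zero coefficient count as a stand-in for file size are all assumptions made for illustration. It quantizes one 8x8 block in the JPEG style (DCT, divide by a table, round) and reports a cost of the form distortion plus `lam` times rate.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def rate_distortion(block: np.ndarray, q_table: np.ndarray, lam: float):
    """Quantize one square block JPEG-style.

    Returns (rate_proxy, mse, cost), where rate_proxy is the number of
    non-zero quantized coefficients (a crude, hypothetical stand-in for
    entropy-coded file size) and cost = mse + lam * rate_proxy.
    """
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T                 # forward 2-D DCT
    quantized = np.round(coeffs / q_table)   # lossy step: table controls precision
    recon = d.T @ (quantized * q_table) @ d  # dequantize and invert the DCT
    rate_proxy = int(np.count_nonzero(quantized))
    mse = float(np.mean((block - recon) ** 2))
    return rate_proxy, mse, mse + lam * rate_proxy


# Illustrative comparison of a fine and a coarse flat table on a random block.
rng = np.random.default_rng(0)
block = rng.uniform(0.0, 255.0, size=(8, 8))
fine = rate_distortion(block, np.full((8, 8), 4.0), lam=1.0)
coarse = rate_distortion(block, np.full((8, 8), 64.0), lam=1.0)
print("fine   (rate, mse, cost):", fine)
print("coarse (rate, mse, cost):", coarse)
```

A coarser table zeroes out more coefficients, which lowers the rate proxy at the price of a larger reconstruction error; a rate-distortion-aware encoder searches for the table that balances the two for a given `lam`.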


