A Preliminary Study on GPT-Image Generation Model for Image Restoration

Recent advances in OpenAI's GPT-series multimodal generation models have shown remarkable capabilities in producing visually compelling images. In this work, we investigate their potential impact on the image restoration community. We provide, to the best of our knowledge, the first systematic benchmark of these models across diverse restoration scenarios. Our evaluation shows that, while the restoration results generated by GPT-Image models are often perceptually pleasing, they tend to lack pixel-level structural fidelity with respect to the ground-truth references. Typical deviations include changes in image geometry, in object positions or counts, and even in perspective. Beyond these empirical observations, we further demonstrate that outputs from GPT-Image models can act as strong visual priors, offering notable performance improvements for existing restoration networks. Using dehazing, deraining, and low-light enhancement as representative case studies, we show that integrating GPT-generated priors significantly boosts restoration quality. This study not only provides practical insights and a baseline framework for incorporating GPT-based generative priors into restoration pipelines, but also highlights new opportunities for bridging image generation models and restoration tasks. To support future research, we will release the GPT-restored results.
