High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of their receptive fields, existing CNN-based methods are typically prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture that jointly captures global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations, thereby suppressing ghosting. For the local branch, we design a local context extractor (LCE) to capture short-range image features and use a channel attention mechanism to select informative local details from the extracted features, complementing the global branch. By incorporating CA-ViT blocks as basic components, we further build the HDR-Transformer, a hierarchical network that reconstructs high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods both qualitatively and quantitatively, with a considerably reduced computational budget. Code is available at https://github.com/megvii-research/HDR-Transformer
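The sketch below illustrates the dual-branch idea described above in PyTorch: a global branch that runs self-attention inside non-overlapping windows, and a local branch (LCE) that extracts convolutional features re-weighted by channel attention, with the two fused by addition. It is a minimal illustration written from this abstract alone; all module names, dimensions, the plain (non-shifted) window partition, and the additive fusion are assumptions, not the authors' reference implementation (see the linked repository for that).

```python
# Minimal sketch of a dual-branch CA-ViT-style block (illustrative assumptions only).
import torch
import torch.nn as nn


class LocalContextExtractor(nn.Module):
    """Local branch: short-range convolutional features selected by channel attention."""

    def __init__(self, dim, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
        )
        # Squeeze-and-excitation style channel attention over the extracted features.
        self.channel_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):  # x: (B, C, H, W)
        feat = self.conv(x)
        return feat * self.channel_attn(feat)


class CAViTBlock(nn.Module):
    """Dual-branch block: window-based self-attention (global) + LCE (local)."""

    def __init__(self, dim, num_heads=4, window_size=8):
        super().__init__()
        self.window_size = window_size
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.lce = LocalContextExtractor(dim)

    def forward(self, x):  # x: (B, C, H, W), H and W divisible by window_size
        B, C, H, W = x.shape
        ws = self.window_size

        # Global branch: partition into non-overlapping windows and attend inside each window.
        windows = x.view(B, C, H // ws, ws, W // ws, ws)
        tokens = windows.permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, C)
        tokens = self.norm(tokens)
        global_feat, _ = self.attn(tokens, tokens, tokens)

        # Reverse the window partition back to a (B, C, H, W) feature map.
        global_feat = global_feat.reshape(B, H // ws, W // ws, ws, ws, C)
        global_feat = global_feat.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)

        # Fuse: local details complement the global branch (simple additive fusion here).
        return x + global_feat + self.lce(x)


if __name__ == "__main__":
    block = CAViTBlock(dim=32)
    out = block(torch.randn(1, 32, 64, 64))
    print(out.shape)  # torch.Size([1, 32, 64, 64])
```

In this simplified form the two branches share the same input and are merged by addition; the actual HDR-Transformer stacks such blocks hierarchically and uses shifted windows and a more elaborate fusion, as detailed in the paper and repository.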