A Perceptually Optimized and Self-Calibrated Tone Mapping Operator

With the increasing popularity and accessibility of high dynamic range (HDR) photography, tone mapping operators (TMOs) for dynamic range compression are in practical demand. In this paper, we develop a two-stage neural network-based TMO that is self-calibrated and perceptually optimized. In Stage one, motivated by the physiology of the early stages of the human visual system, we first decompose an HDR image into a normalized Laplacian pyramid. We then use two lightweight deep neural networks (DNNs) that take the normalized representation as input and estimate the Laplacian pyramid of the corresponding low dynamic range (LDR) image. We optimize the tone mapping network by minimizing the normalized Laplacian pyramid distance (NLPD), a perceptual metric that aligns with human judgments of tone-mapped image quality. In Stage two, the input HDR image is self-calibrated to compute the final LDR image. We feed the same HDR image, rescaled to different maximum luminances, to the learned tone mapping network and generate a pseudo-multi-exposure image stack with different detail visibility and color saturation. We then train another lightweight DNN to fuse the LDR image stack into a desired LDR image by maximizing a variant of the structural similarity index for multi-exposure image fusion (MEF-SSIM), which has been shown to be perceptually relevant to fused image quality. The proposed self-calibration mechanism through MEF enables our TMO to accept uncalibrated HDR images while remaining physiology-driven. Extensive experiments show that our method produces images with consistently better visual quality. Additionally, since our method builds upon three lightweight DNNs, it is among the fastest local TMOs.
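To make the Stage-one representation more concrete, here is a minimal NumPy sketch of a normalized Laplacian pyramid. It is an illustration under stated assumptions, not the authors' implementation: the 5-tap binomial filter, the number of levels, the constant `sigma`, and the use of a blurred absolute value as the normalization pool are all placeholder choices that the abstract does not specify.

```python
import numpy as np

# Assumed low-pass filter: a 5-tap binomial kernel (not taken from the paper).
KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(img):
    """Separable 5-tap blur with reflect padding; output has the input's shape."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    # Filter along the horizontal axis, then along the vertical axis.
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL_1D))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL_1D))


def downsample(img):
    """Blur, then keep every second pixel in each direction."""
    return blur(img)[::2, ::2]


def upsample(img, shape):
    """Zero-insert to `shape`, then blur; the factor 4 restores the mean level."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * blur(up)


def normalized_laplacian_pyramid(luminance, levels=4, sigma=0.1):
    """Return a list of divisively normalized band-pass images plus a residual.

    `sigma` is a hypothetical stabilizing constant, chosen only for illustration.
    """
    pyramid = []
    current = np.asarray(luminance, dtype=np.float64)
    for _ in range(levels - 1):
        low = downsample(current)
        band = current - upsample(low, current.shape)
        # Divisive normalization: divide each coefficient by a local
        # estimate of band amplitude plus a constant.
        pyramid.append(band / (sigma + blur(np.abs(band))))
        current = low
    # The low-pass residual is normalized the same way in this sketch.
    pyramid.append(current / (sigma + blur(np.abs(current))))
    return pyramid
```

Calling `normalized_laplacian_pyramid(luminance, levels=3)` on a 32×32 luminance array returns three arrays of sizes 32×32, 16×16, and 8×8. In the method described above, a representation of this kind is what the two lightweight DNNs would take as input.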

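Summary: The paper proposes a two-stage, neural network-based TMO for compressing the dynamic range of HDR images. Stage one decomposes the HDR image into a normalized Laplacian pyramid, a representation motivated by the early stages of the human visual system, and uses two lightweight DNNs to estimate the Laplacian pyramid of the corresponding LDR image; the tone mapping network is trained by minimizing NLPD. Stage two handles calibration: the same HDR image is rescaled to several maximum luminances and passed through the trained network to produce a pseudo-multi-exposure stack, which a third lightweight DNN fuses into the final LDR image by maximizing a variant of MEF-SSIM. Because of this self-calibration step, the TMO accepts uncalibrated HDR images, and because it relies on only three lightweight DNNs, it is among the fastest local TMOs.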
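The self-calibration step in Stage two can be sketched in a few lines of NumPy. This is only a rough illustration: `rescale_to_max_luminance`, `pseudo_exposure_stack`, and `placeholder_tone_map` are hypothetical names, the linear peak rescaling is an assumption about how "rescaled with different maximum luminances" is realized, and the logarithmic curve merely stands in for the learned tone mapping network. The learned fusion network is omitted.

```python
import numpy as np


def rescale_to_max_luminance(hdr, max_luminance):
    """Linearly rescale `hdr` so its brightest pixel equals `max_luminance`.

    Assumption: a plain linear scaling of the peak; the paper may differ.
    """
    peak = hdr.max()
    if peak <= 0:
        raise ValueError("HDR image must contain positive luminance values")
    return hdr * (max_luminance / peak)


def placeholder_tone_map(calibrated, assumed_peak=1e4):
    """Stand-in for the learned network: log compression into [0, 1]."""
    return np.clip(np.log1p(calibrated) / np.log1p(assumed_peak), 0.0, 1.0)


def pseudo_exposure_stack(hdr, tone_map, max_luminances):
    """Tone-map the same image at several assumed peak luminances.

    Returns an array of shape (len(max_luminances), H, W).
    """
    return np.stack(
        [tone_map(rescale_to_max_luminance(hdr, m)) for m in max_luminances]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hdr = rng.uniform(1e-3, 5.0, size=(8, 8))  # uncalibrated relative luminance
    stack = pseudo_exposure_stack(hdr, placeholder_tone_map, [10.0, 100.0, 1000.0])
    print(stack.shape, stack.mean(axis=(1, 2)))
```

Each slice of the stack plays the role of one "exposure". In the method summarized above, a further lightweight DNN trained with the MEF-SSIM variant would fuse these slices into the final LDR image.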