A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. Generative Adversarial Networks (GANs) have recently been shown to be a promising solution for this task, as they can successfully hallucinate missing image information. As with other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques so that they are more aware of the capabilities and limitations of the human visual system and can therefore reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus it on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination for input samples of different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy makes the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
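To make the notion of eccentricity-dependent sparse sampling concrete, the following is a minimal sketch of a sampling mask whose density falls off with distance from the gaze point. It is an illustration only, not the sampling scheme used in this work: the exponential falloff, the function names (`sampling_probability`, `foveated_mask`, `kept_fraction`), and all parameter values (`p_fovea`, `p_min`, `falloff_deg`, `pixels_per_degree`) are assumptions chosen for readability.

```python
import math
import random


def sampling_probability(ecc_deg, p_fovea=1.0, p_min=0.01, falloff_deg=5.0):
    """Hypothetical density model: exponential decay with eccentricity,
    floored at p_min. Not taken from the paper."""
    return max(p_min, p_fovea * math.exp(-ecc_deg / falloff_deg))


def foveated_mask(width, height, gaze_xy, pixels_per_degree=10.0, seed=0):
    """Return a height x width nested list of booleans; True marks a kept sample."""
    rng = random.Random(seed)  # seeded so the mask is reproducible
    gx, gy = gaze_xy
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Eccentricity: distance from the gaze point, converted to degrees.
            ecc = math.hypot(x - gx, y - gy) / pixels_per_degree
            row.append(rng.random() < sampling_probability(ecc))
        mask.append(row)
    return mask


def kept_fraction(mask, x0, x1, y0, y1):
    """Fraction of kept samples inside the half-open window [x0, x1) x [y0, y1)."""
    cells = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(cells) / len(cells)


mask = foveated_mask(200, 200, gaze_xy=(100, 100))
near = kept_fraction(mask, 90, 110, 90, 110)  # window around the gaze point
far = kept_fraction(mask, 0, 20, 0, 20)       # window in the far periphery
```

Under these assumptions, `near` is much larger than `far`: almost every pixel around the gaze point is kept, while only a small fraction survives in the periphery. A reconstruction network of the kind described above would have to fill in the discarded pixels.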