The concept of image similarity is ambiguous: images that are considered similar in one context might not be in another. This ambiguity motivates the creation of metrics for specific contexts. This work explores the ability of the successful deep perceptual similarity (DPS) metrics to adapt to a given context. DPS metrics, which have emerged recently, compare images using the deep features of neural networks. These metrics have been successful on datasets that capture average human perception in limited settings, but the question remains whether they can be adapted to specific contexts of similarity. No single metric can suit all definitions of similarity, and previous metrics have been rule-based, which makes them labor-intensive to rewrite for new contexts. DPS metrics, on the other hand, use neural networks, which might be retrained for each context. However, retraining networks takes resources and might ruin performance on previous tasks. This work examines the adaptability of DPS metrics by training positive scalars for the deep features of pretrained CNNs so that they correctly measure similarity in different contexts. Evaluation is performed on contexts defined by randomly ordering six image distortions (e.g. rotation) according to which should be considered more similar when applied to an image. This also gives insight into whether the features in the CNN are sufficient to discern different distortions without retraining. The trained metrics are then evaluated on a perceptual similarity dataset to determine whether adapting to an ordering affects their performance on established scenarios. The findings show that DPS metrics can be adapted with high performance. While the adapted metrics have difficulties with the same contexts as the baselines, performance is improved in 99% of cases. Finally, it is shown that the adaptation is not significantly detrimental to prior performance on perceptual similarity.
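The idea of training positive scalars on frozen deep features can be illustrated with a small sketch. The abstract does not specify the distance function, loss, or optimizer, so everything here is an assumption made for illustration: the distance is a weighted sum of squared per-channel feature differences, the loss is a hinge ranking loss over triplets, and positivity is kept by clamping the scalars at zero after each update. The names `weighted_distance` and `train_scalars` are hypothetical, and the short lists of floats stand in for the deep features of a pretrained CNN.

```python
import random


def weighted_distance(weights, feats_x, feats_y):
    """Weighted sum of squared per-channel feature differences."""
    return sum(w * (fx - fy) ** 2 for w, fx, fy in zip(weights, feats_x, feats_y))


def train_scalars(triplets, num_channels, lr=0.05, margin=0.1, epochs=200, seed=0):
    """Fit non-negative per-channel scalars with a hinge ranking loss.

    Each triplet is (ref, closer, farther): feature vectors where `closer`
    should end up nearer to `ref` than `farther` in the target context.
    The features themselves are never changed; only the scalars are trained.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_channels  # start from a uniform (unadapted) metric
    triplets = list(triplets)
    for _ in range(epochs):
        rng.shuffle(triplets)
        for ref, closer, farther in triplets:
            d_close = weighted_distance(weights, ref, closer)
            d_far = weighted_distance(weights, ref, farther)
            if margin + d_close - d_far <= 0:
                continue  # already ranked correctly with the required margin
            for c in range(num_channels):
                # Gradient of the hinge loss with respect to weight c.
                grad = (ref[c] - closer[c]) ** 2 - (ref[c] - farther[c]) ** 2
                # Gradient step, then clamp at zero (assumed way to keep scalars non-negative).
                weights[c] = max(0.0, weights[c] - lr * grad)
    return weights


# Toy context: a distortion that strongly changes channel 0 should count as
# more similar than one that slightly changes channel 1.
ref, closer, farther = [0.0, 0.0], [1.0, 0.0], [0.0, 0.5]
adapted = train_scalars([(ref, closer, farther)], num_channels=2)
print(weighted_distance(adapted, ref, closer) < weighted_distance(adapted, ref, farther))
```

With uniform scalars the toy metric ranks the two distortions in the opposite order; after training, the scalar on channel 0 shrinks and the scalar on channel 1 grows until the requested ordering holds. Because only the scalars change, the underlying network is left intact, which is the property that makes this kind of adaptation cheaper than retraining.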