Detecting spliced images is an emerging challenge in computer vision. Unlike prior methods, which focus on detecting low-level artifacts introduced during manipulation, we treat the problem as image retrieval: given a spliced query image, our goal is to retrieve the original image from a database of authentic images. To this end, we propose representing an image by its constituent objects, based on the intuition that the finest granularity of manipulation is often at the object level. We introduce a framework, object embeddings for spliced image retrieval (OE-SIR), that uses modern object detectors to localize object regions. Each region is then embedded, and the embeddings are collectively used to represent the image. Further, we propose a student-teacher training paradigm for learning discriminative embeddings within object regions, which avoids expensive multiple forward passes. We also provide a detailed analysis of the efficacy of different feature embedding models. Extensive experimental results show that OE-SIR achieves state-of-the-art performance in spliced image retrieval.
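The retrieval idea can be illustrated with a minimal sketch: each image is a set of per-object embedding vectors, and a database image is ranked by how well any of its objects matches any object in the query. The object detector and embedding network are omitted, and hand-made vectors stand in for their output. The max-over-pairs cosine scoring and the names `image_score` and `retrieve` are assumptions made for illustration; the abstract does not specify how OE-SIR compares object sets.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def image_score(query_objs, db_objs):
    """Best cosine similarity between any query object and any database object.

    Assumed scoring rule (hypothetical, not taken from the paper).
    """
    sims = l2_normalize(query_objs) @ l2_normalize(db_objs).T
    return float(sims.max())

def retrieve(query_objs, database, top_k=1):
    """Rank database images (dict: image id -> object embeddings) by image_score."""
    scored = [(image_score(query_objs, objs), img_id)
              for img_id, objs in database.items()]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [img_id for _, img_id in scored[:top_k]]

# Toy data: hand-made 3-D vectors stand in for learned object embeddings.
database = {
    "original": np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "unrelated": np.array([[0.0, 0.0, 1.0]]),
}
# Spliced query: keeps one object of "original" and adds a pasted-in object.
spliced_query = np.array([[0.0, 1.0, 0.0], [0.6, 0.0, -0.8]])
best = retrieve(spliced_query, database)
```

Because the score depends on the best-matching object rather than on a single whole-image descriptor, a pasted-in object in the query does not lower the score of the authentic source image in this toy setup.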