We study query-based attacks against image retrieval to evaluate its robustness to adversarial examples in the black-box setting, where the adversary only has query access to the top-k ranked unlabeled images from the database. Query attacks on image classification craft adversarial examples from the returned labels or confidence scores. In retrieval the problem is harder, because the effectiveness of an attack is difficult to quantify from a partial retrieved list. In this paper, we make the first attempt at a Query-based Attack against Image Retrieval (QAIR), which aims to completely subvert the top-k retrieval results. Specifically, we design a new relevance-based loss that quantifies the attack effect by measuring the set similarity between the top-k retrieval results before and after the attack, and that guides the gradient optimization. To further improve attack efficiency, we propose a recursive model stealing method that acquires transferable priors on the target model and generates prior-guided gradients. Comprehensive experiments show that the proposed attack achieves a high attack success rate with few queries against image retrieval systems in the black-box setting. Evaluations on a real-world visual search engine show that it successfully deceives a commercial system, Bing Visual Search, with a 98% attack success rate using only 33 queries on average.
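The idea of scoring an attack by the set similarity of the top-k lists before and after perturbation can be illustrated with a minimal sketch. This is not the paper's exact loss: the abstract only states that set similarity is measured, so the function names `topk_overlap` and `rank_weighted_overlap`, the use of plain image identifiers, and the rank weighting are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def topk_overlap(before: Sequence[Hashable], after: Sequence[Hashable]) -> float:
    """Fraction of the original top-k items still returned after the attack.

    1.0 means the attack changed nothing; 0.0 means the original top-k
    list has been completely pushed out of the returned results.
    """
    if not before:
        raise ValueError("the original top-k list must be non-empty")
    original = set(before)
    return len(original & set(after)) / len(original)


def rank_weighted_overlap(before: Sequence[Hashable], after: Sequence[Hashable]) -> float:
    """Hypothetical variant: surviving items count more if they originally ranked higher.

    Assumes the identifiers in `before` are unique.
    """
    k = len(before)
    if k == 0:
        raise ValueError("the original top-k list must be non-empty")
    # Rank 0 (best match) gets weight k, the last item gets weight 1.
    weights = {item: k - rank for rank, item in enumerate(before)}
    surviving = set(after)
    kept = sum(w for item, w in weights.items() if item in surviving)
    return kept / sum(weights.values())


if __name__ == "__main__":
    clean = ["a", "b", "c", "d"]      # top-4 for the clean query
    attacked = ["c", "x", "y", "a"]   # top-4 for the perturbed query
    print(topk_overlap(clean, attacked))
    print(rank_weighted_overlap(clean, attacked))
```

An attacker with only query access would try to drive such a score toward zero. Because the score depends only on the returned list, it can be evaluated without labels or confidence values from the target system.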