Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching

Instance-level image retrieval aims to find images containing the same object as a given query, despite variations in size, position, or appearance. To address this challenging task, we propose Patchify, a simple yet effective patch-wise retrieval framework that offers high performance, scalability, and interpretability without requiring fine-tuning. Patchify divides each database image into a small number of structured patches and performs retrieval by comparing these local features with a global query descriptor, enabling accurate and spatially grounded matching. To assess not just retrieval accuracy but also spatial correctness, we introduce LocScore, a localization-aware metric that quantifies whether the retrieved region aligns with the target object. This makes LocScore a valuable diagnostic tool for understanding and improving retrieval behavior. We conduct extensive experiments across multiple benchmarks, backbones, and region selection strategies, showing that Patchify outperforms global methods and complements state-of-the-art reranking pipelines. Furthermore, we apply Product Quantization for efficient large-scale retrieval and highlight the importance of using informative features during compression, which significantly boosts performance. Project website: https://wons20k.github.io/PatchwiseRetrieval/
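The matching step described above, comparing a global query descriptor against features of a few structured database patches, can be illustrated with a minimal sketch. It is not the authors' implementation: the multi-level grid layout, the function names `grid_boxes`, `toy_embed`, and `patch_score`, and the mean-color embedding (a stand-in for a frozen backbone) are all assumptions made for illustration.

```python
import numpy as np

def grid_boxes(height, width, levels=(1, 2, 3)):
    """Return (top, left, bottom, right) boxes for an n x n grid per level.

    The grid layout is an assumed example of "structured patches".
    """
    boxes = []
    for n in levels:
        ys = np.linspace(0, height, n + 1).astype(int)
        xs = np.linspace(0, width, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                boxes.append((int(ys[i]), int(xs[j]), int(ys[i + 1]), int(xs[j + 1])))
    return boxes

def toy_embed(region):
    """Hypothetical stand-in for a frozen backbone: L2-normalized mean color."""
    v = region.reshape(-1, region.shape[-1]).mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-12)

def patch_score(query, image, embed=toy_embed, levels=(1, 2, 3)):
    """Score a database image by its best-matching patch.

    Returns the highest cosine similarity between the global query
    descriptor and any patch descriptor, plus the box of that patch.
    """
    q = embed(query)
    boxes = grid_boxes(image.shape[0], image.shape[1], levels)
    sims = [float(q @ embed(image[t:b, l:r])) for t, l, b, r in boxes]
    best = int(np.argmax(sims))
    return sims[best], boxes[best]

# Toy data: a red query object, and a blue database image whose
# top-left quarter contains the red object.
query = np.zeros((8, 8, 3))
query[..., 0] = 1.0
db_image = np.zeros((12, 12, 3))
db_image[..., 2] = 1.0
db_image[:6, :6] = [1.0, 0.0, 0.0]

score, box = patch_score(query, db_image)
global_score, _ = patch_score(query, db_image, levels=(1,))
```

In this toy case the whole-image descriptor is dominated by the background, so `global_score` is low, while the best patch isolates the object and `box` indicates where the match occurred. That returned box is the kind of spatial grounding a localization-aware metric such as LocScore is meant to evaluate.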
