Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a challenging cross-modal retrieval task. In prior work, retrieval is performed by ranking the gallery images by their distance to the query sketch. However, the domain gap between sketches and images, together with the zero-shot setting, makes it hard for neural networks to generalize. This paper tackles these challenges from a new perspective: exploiting the features of the gallery images. We propose a Cluster-then-Retrieve (ClusterRetri) method that clusters the gallery images and uses the cluster centroids as proxies for retrieval. Furthermore, we propose a distribution alignment loss that aligns the image and sketch features with a common Gaussian distribution, reducing the domain gap. Despite its simplicity, the proposed method outperforms state-of-the-art methods by a large margin on popular datasets, e.g., up to 31% and 39% relative improvement in mAP@all on the Sketchy and TU-Berlin datasets, respectively.
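The cluster-then-retrieve idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the distance metric, or exactly how centroids stand in for images. The code below assumes plain k-means, Euclidean distance, and that each gallery image is ranked by the distance between the query and the centroid of the cluster it belongs to. The function names `kmeans` and `cluster_then_retrieve` are hypothetical.

```python
import numpy as np


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct gallery features.
    init = rng.choice(len(features), size=k, replace=False)
    centroids = features[init].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every feature to every centroid.
        d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Final assignment against the converged centroids.
    d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)


def cluster_then_retrieve(query, gallery, k):
    """Rank gallery indices using each image's cluster centroid as its proxy."""
    centroids, labels = kmeans(gallery, k)
    proxies = centroids[labels]  # one proxy feature per gallery image
    dists = np.linalg.norm(proxies - query, axis=1)
    # Stable sort keeps the original gallery order within a cluster.
    return np.argsort(dists, kind="stable")
```

Here `query` would be a sketch feature of shape `(d,)` and `gallery` an array of image features of shape `(n, d)`, both assumed to come from an already trained encoder. Because all images in a cluster share one proxy, they are ranked together, which is the intuition behind using gallery structure to smooth out noisy sketch-to-image distances.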