This paper addresses semantic-based image retrieval of natural scenes. A typical content-based image retrieval system treats the query image and the images in the dataset as collections of low-level features, and retrieves a ranked list of images based on the similarity between the features of the query image and those of the dataset images. However, the top-ranked images in the retrieved list, although highly similar to the query image in terms of features, may differ from it in the user's semantic interpretation; this discrepancy is known as the semantic gap. To reduce the semantic gap, this paper investigates how natural scene retrieval can be performed using the bag of visual words model and the distribution of local semantic concepts. It studies the efficiency of different approaches to representing the semantic information depicted in natural scene images for image retrieval. Extensive experiments were conducted to study the efficiency of using semantic information, as well as the bag of visual words model, for natural and urban scene image retrieval.
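The retrieval pipeline summarized in the abstract (images as collections of low-level features, ranked by similarity to the query) can be illustrated with a minimal bag of visual words sketch. This is not the paper's implementation: the function names `bovw_histogram` and `rank_images` are hypothetical, the codebook is assumed to be given (it is commonly obtained by clustering local descriptors), and histogram intersection is used here only as one plausible similarity measure.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Quantise local descriptors against a codebook of visual words.

    descriptors: (n, d) array of local features from one image (assumed n >= 1).
    codebook:    (k, d) array of visual words (assumed to be precomputed).
    Returns an L1-normalised histogram of length k.
    """
    # Squared Euclidean distance between every descriptor and every visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Each descriptor votes for its nearest visual word.
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


def rank_images(query_hist, dataset_hists):
    """Rank dataset images by decreasing histogram intersection with the query.

    Histogram intersection is an illustrative choice of similarity,
    not necessarily the measure used in the paper.
    """
    scores = np.minimum(dataset_hists, query_hist).sum(axis=1)
    order = np.argsort(-scores, kind="stable")
    return order, scores[order]


# Toy example: a 3-word codebook in a 2-D descriptor space.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
query_desc = np.array([[0.1, 0.0], [0.9, 1.0], [1.0, 0.9], [0.0, 0.1]])
query_hist = bovw_histogram(query_desc, codebook)

dataset_hists = np.array([
    [0.0, 0.0, 1.0],    # image 0: only the third visual word
    [0.5, 0.5, 0.0],    # image 1: same word distribution as the query
    [0.25, 0.25, 0.5],  # image 2: partial overlap
])
order, scores = rank_images(query_hist, dataset_hists)
```

A ranking of this kind depends only on low-level feature statistics, so two images with similar histograms can still depict scenes a user would label differently. That is the semantic gap the abstract refers to.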