In most image retrieval systems, images are associated with various high-level semantics, called tags or annotations. Virtually all state-of-the-art image annotation methods that handle imbalanced labels are search-based techniques, which are time-consuming. In this paper, a novel coupled dictionary learning approach is proposed that simultaneously learns a limited number of visual prototypes and their corresponding semantics, leading to a real-time image annotation procedure. Another contribution of this paper is the use of a marginalized loss function instead of the squared loss, which is inappropriate for image annotation with imbalanced labels; the marginalized loss also enables a simple and effective prototype-updating scheme. In addition, we introduce ${\ell}_1$ regularization on the semantic prototypes to preserve the sparse and imbalanced nature of labels in the learned prototypes. Finally, comprehensive experiments on several datasets demonstrate the efficiency of the proposed method for image annotation in terms of both accuracy and running time. The reference implementation is publicly available at https://github.com/hamid-amiri/MCDL-Image-Annotation.
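To make the overall pipeline concrete, the following is a minimal NumPy sketch of a generic coupled dictionary learning setup, not the paper's exact algorithm: visual features `X` and label vectors `Y` are approximated through shared codes by a visual dictionary `P` (visual prototypes) and a semantic dictionary `Q` (semantic prototypes), with an ${\ell}_1$ proximal step on `Q`. A squared-loss surrogate stands in for the marginalized loss described above, and all function names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def soft_threshold(Z, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def coupled_dictionary_learning(X, Y, k=64, lam_code=0.1, lam_l1=0.05,
                                n_iters=50, step=1e-2, seed=0):
    """Illustrative coupled dictionary learning sketch (not the paper's exact
    method): visual features X (d_v x n) and label vectors Y (d_s x n) share
    codes A through a visual dictionary P and a semantic dictionary Q.
    A squared-loss surrogate replaces the paper's marginalized loss; the l1
    penalty on Q mirrors the sparse, imbalanced nature of the labels."""
    rng = np.random.default_rng(seed)
    d_v, n = X.shape
    d_s, _ = Y.shape
    P = rng.standard_normal((d_v, k))         # visual prototypes
    Q = rng.standard_normal((d_s, k)) * 0.01  # semantic prototypes
    for _ in range(n_iters):
        # 1) Shared codes via ridge regression on the stacked system.
        M = np.vstack([P, Q])                 # (d_v + d_s) x k
        Z = np.vstack([X, Y])                 # (d_v + d_s) x n
        A = np.linalg.solve(M.T @ M + lam_code * np.eye(k), M.T @ Z)
        # 2) Visual prototypes via regularized least squares.
        P = X @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(k))
        P /= np.maximum(np.linalg.norm(P, axis=0, keepdims=True), 1e-8)
        # 3) Semantic prototypes via a proximal gradient step (l1-regularized).
        grad_Q = (Q @ A - Y) @ A.T / n
        Q = soft_threshold(Q - step * grad_Q, step * lam_l1)
    return P, Q

def annotate(x, P, Q, lam_code=0.1, top_k=5):
    """Real-time annotation: encode a new image against the visual prototypes,
    then read off label scores from the semantic prototypes."""
    k = P.shape[1]
    a = np.linalg.solve(P.T @ P + lam_code * np.eye(k), P.T @ x)
    scores = Q @ a
    return np.argsort(-scores)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((128, 500))               # toy visual features
    Y = (rng.random((30, 500)) < 0.05).astype(float)  # sparse toy labels
    P, Q = coupled_dictionary_learning(X, Y)
    print(annotate(X[:, 0], P, Q))                    # indices of top-5 labels
```

Because annotation reduces to one small linear solve against a fixed set of prototypes, the per-image cost is independent of the training-set size, which is the source of the real-time behavior claimed above.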