Learning compact representations is vital and challenging for large-scale multimedia data. Cross-view/cross-modal hashing for effective binary representation learning has received significant attention with the exponentially growing availability of multimedia content. Most existing cross-view hashing algorithms emphasize the similarities within individual views, which are then connected via cross-view similarities. In this work, we focus on exploiting the discriminative information from different views, and propose an end-to-end method to learn semantic-preserving and discriminative binary representations, dubbed Discriminative Cross-View Hashing (DCVH), with the aim of learning a multitask binary representation for various tasks including cross-view retrieval, image-to-image retrieval, and image annotation/tagging. The proposed DCVH has the following key components. First, it uses convolutional neural network (CNN) based nonlinear hashing functions and multilabel classification for both images and texts simultaneously. By using Direct Binary Embedding (DBE) layers, these hashing functions achieve effective continuous relaxation during training without an explicit quantization loss. Second, we propose an effective view alignment via Hamming distance minimization, which is efficiently accomplished by bit-wise XOR operations. Extensive experiments on two image-text benchmark datasets demonstrate that DCVH outperforms state-of-the-art cross-view hashing algorithms as well as single-view image hashing algorithms. In addition, DCVH provides competitive performance for image annotation/tagging.
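To illustrate why binary codes make cross-view comparison cheap, the following is a minimal sketch of Hamming distance computed with bit-wise XOR, and of ranking text codes against an image code. It is not the DCVH training procedure: the helper names (`pack_bits`, `hamming_distance`, `rank_by_hamming`) and the 4-bit toy codes are assumptions made for illustration only.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into one Python integer (hypothetical helper)."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code


def hamming_distance(code_a, code_b):
    """Number of differing bits: XOR marks the mismatches, then count the ones."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Indices of database codes ordered from nearest to farthest (ties keep input order)."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Toy example: one image code queried against three text codes (values are made up).
image_code = pack_bits([1, 0, 1, 0])
text_codes = [
    pack_bits([1, 0, 1, 0]),  # identical to the query
    pack_bits([0, 1, 0, 1]),  # every bit differs
    pack_bits([1, 0, 0, 0]),  # one bit differs
]
distances = [hamming_distance(image_code, c) for c in text_codes]
ranking = rank_by_hamming(image_code, text_codes)
```

In this toy setting, well-aligned views would give an image and its matching text a small Hamming distance, so cross-view retrieval reduces to sorting by the XOR-based distance.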