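Project title: Research on Multimodal Hash Learning Methods for Big-Data Cross-Media Retrieval
Project number: No.61502122
Project type: Young Scientists Fund project
Approval year: 2016
Discipline: Other
Project author: Zhai Deming (翟德明)
Affiliation: Harbin Institute of Technology
Funding: 200,000 yuan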
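Chinese abstract (translated): To address the high storage cost and low retrieval efficiency that come with big data, hashing-based methods have attracted growing attention and acceptance in recent years. Their most appealing property is that data are indexed by binary hash codes, which give a more compact representation, reduce storage space, and lower computational complexity. Building on prior related work, this project targets cross-media retrieval and, in response to the multimodal, high-dimensional, and nonlinear nature of big-data feature spaces, studies new nonlinear multimodal hash learning methods together with the specific techniques required in concrete applications. The specific research objectives are: (1) propose new nonlinear multimodal hash learning algorithms that use local learning to better model the complex structure of big data, and analyze and prove the properties of these algorithms; (2) based on the manifold characteristics of image distributions and the active selection of supervised information, propose semi-supervised multimodal hash learning methods that incorporate active learning and manifold learning; (3) apply the proposed methods to big-data cross-media retrieval systems, with the aim of improving their query precision, recall, and efficiency.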
Chinese keywords (translated): large-scale machine learning;hash learning;cross-media retrieval
English abstract: To confront the high storage cost and low retrieval efficiency of large-scale databases, hashing-based methods have attracted increasing attention and acceptance. The appealing property of hashing methods is that they index data with binary hash codes, which offer both a compact representation and low complexity in distance computation. Building on recent related work on hashing, this project proposes to study nonlinear multimodal hash function learning and its application to real-world cross-media retrieval problems, in order to deal with the multimodal, high-dimensional, and nonlinear nature of the feature spaces. More specifically, we aim to (1) propose novel nonlinear multimodal hashing approaches that make full use of local information to better model the complex structure of big data, and analyze and verify their properties; (2) based on the manifold structure of image data and active selection of supervised information, propose semi-supervised multimodal hash learning that incorporates active learning and manifold learning; (3) apply the proposed algorithms to real-world large-scale cross-media retrieval systems, in order to improve query accuracy, recall, and efficiency.
English keywords: large-scale machine learning;hash learning;cross-media retrieval