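Project title: Distributed Copy Detection for Massive Video Libraries
Project number: No.61303159
Project type: Young Scientists Fund
Approval year: 2014
Discipline: Automation technology, computer technology
Project author: Gu Xiaoguang (顾晓光)
Author affiliation: Institute of Computing Technology, Chinese Academy of Sciences
Funding: 280,000 CNY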
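Chinese abstract (translated): Internet video applications, exemplified by video sharing, have led to the creation and rapid spread of massive video resources. With this comes a strong demand for supervision, retrieval, and copyright protection of Internet video. Video copy detection for massive video libraries is the key technology behind such applications. However, existing research results still struggle to achieve effective detection on massive video libraries, mainly because existing algorithms do not scale well. This project intends to address that problem by studying distributed video copy detection. To support distributed storage and distributed querying of high-dimensional index structures, a new model for constructing distributed high-dimensional indexes is to be proposed, so as to ensure load balance among distributed nodes and maximize the data scale. To increase detection speed, an optimized query algorithm for distributed high-dimensional indexes is to be proposed, which achieves efficient similarity queries by planning multi-probe sequences offline and generating optimized binary codes within subspaces. To reduce query load, an unsupervised-learning-based algorithm for mining key local feature points is to be proposed, so as to shrink the feature database and reduce both the number of features extracted online and the computational complexity. Through this research, the project aims to make a breakthrough on copy detection over massive video libraries and lay a foundation for new applications.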
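Chinese keywords (translated): video copy detection; distributed computing; high-dimensional indexing; local features; locality-sensitive hashing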
English abstract: With the development of many innovative Internet applications, such as video sharing, a very large volume of video resources has accumulated. The sudden increase in the volume of video that users can share on the Internet highlights the importance of developing more efficient applications to supervise, track, and filter illegal video content. Content-based copy detection for massive video collections is the key technology for these applications. However, existing research cannot perform effective detection on such a large video library. The main difficulty is that the scalability of existing algorithms cannot keep up with the rapid growth of the data scale. We propose a distributed content-based video copy detection framework to resolve this problem. Three key problems will be studied. First, a novel theoretical model for constructing distributed high-dimensional indexing is proposed. The proposed model can guarantee optimal load balance across all distributed nodes. This is the basis for maximizing the data scale and data processing ability. Second, an optimal search algorithm for distributed high-dimensional indexing is proposed to boost search speed. The proposed algorithm generates the optimal multi-probe sequence offline and hashes the neighbor high-dimensional vectors in the same subspace to opt
English keywords: Video Copy Detection; Distributed Computing; High-Dimensional Indexing; Locality Sensitive Hashing; Local Feature
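
Illustrative note: the abstract and keywords refer to locality-sensitive hashing, multi-probe query sequences, and an index spread across distributed nodes, but the record gives no algorithmic detail. The following is only a minimal sketch of the general idea, under assumptions that do not come from the project: random-hyperplane hashing stands in for whatever hash family the project uses, probing is done by flipping bits of the query code in order of Hamming distance, and shards are simulated as in-process dictionaries with a hypothetical placement rule (bucket code modulo the shard count). All names (`make_hyperplanes`, `hash_code`, `probe_sequence`, `ShardedLSHIndex`) are invented for illustration.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def probe_sequence(code, max_flips):
    """Yield the code itself, then codes at Hamming distance 1..max_flips."""
    yield code
    for radius in range(1, max_flips + 1):
        for positions in combinations(range(len(code)), radius):
            probed = list(code)
            for p in positions:
                probed[p] ^= 1  # flip the selected bit
            yield tuple(probed)


class ShardedLSHIndex:
    """Toy index: each shard is a dict from bucket code to item ids."""

    def __init__(self, hyperplanes, num_shards):
        self.hyperplanes = hyperplanes
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def _shard_of(self, code):
        # Hypothetical placement rule: read the code as a binary integer
        # and take it modulo the number of shards.
        return int("".join(map(str, code)), 2) % len(self.shards)

    def add(self, item_id, vector):
        code = hash_code(vector, self.hyperplanes)
        self.shards[self._shard_of(code)][code].append(item_id)

    def query(self, vector, max_flips=1):
        """Collect candidate ids from the query bucket and nearby buckets."""
        code = hash_code(vector, self.hyperplanes)
        candidates = []
        for probed in probe_sequence(code, max_flips):
            shard = self.shards[self._shard_of(probed)]
            candidates.extend(shard.get(probed, []))
        return candidates
```

In this sketch a query visits only the shards that own the probed buckets, which is one simple way to combine multi-probe lookup with a partitioned index. The candidates it returns would still need verification against the original features, and the modulo placement rule makes no claim about load balance.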