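Project title: Research on Key Technologies of Audio Fingerprinting in Music Retrieval
Project number: No.60873255
Project type: General Program
Approval year: 2009
Discipline: Automation technology; computer technology
Project leader: Li Wei (李伟)
Institution: Fudan University
Funding: 260,000 yuan (RMB)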
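Chinese abstract: The vast amount of music on the Internet has given rise to digital audio fingerprinting, a technique for automatic music matching, but the typical algorithms available today still fall far short of human auditory recognition. The main contribution of this project is the design of three robust audio features for music identification under severe distortion: (1) SIFT descriptors computed on the music spectrogram, which still identify the original version in the database with an accuracy above 80% when the query clip has been severely time-stretched or pitch-shifted; (2) MDCT spectral entropy and Zernike moments of the auditory image, each computed in the partially decoded MP3 compressed domain, which are highly robust to common audio signal processing. Over its three years the research fully met its expected goals and achieved notable results in robust audio identification. A total of 14 papers were published: 5 full and short papers at the top international conferences ACM MM and ACM SIGIR, 1 paper at an EI-indexed major international conference, 1 in an authoritative domestic journal, 2 in semi-authoritative domestic journals, and 5 in core journals. In addition, 1 patent was applied for, 3 master's students graduated, and the project received one Shanghai Natural Science Award, Second Class (ranked third among the awardees).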
Chinese keywords: Audio identification; Robustness; Spectrogram; Audio feature
English abstract: The vast amount of music on the Internet has given rise to "audio fingerprinting", a technique typically used for automatic music identification. The main contributions of this research are three novel audio features designed for audio identification: the compressed-domain Zernike moment of the auditory image, the compressed-domain MDCT spectral entropy, and the SIFT descriptor of the audio spectrogram. The three identification algorithms are robust against common signal processing and synchronization distortions and achieve high identification precision. In this research field, we published 14 technical papers over the past three years, five of which appeared at top international conferences as full and short papers. In addition, we applied for one domestic patent and received a second-class natural science prize from Shanghai City. With the support of this grant, three students graduated with master's degrees.
English keywords: Audio identification; Robustness; Audio feature; Audio spectrogram