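Project title: Research on Audio Classification and Retrieval Algorithms Based on the Compressed-Domain Auditory Spectrum
Project number: No.60872115
Project type: General Program
Year of approval: 2009
Discipline: Automation technology; computer technology
Project author: Yu Xiaoqing (余小清)
Affiliation: Shanghai University
Funding: 260,000 yuan (RMB)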
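Chinese abstract (translated): As computer processing power improves, the Internet develops, and demand for audio information grows, the question of how to classify and retrieve massive amounts of compressed-format audio data quickly and accurately has attracted wide attention from researchers. Over more than three years of research and exploration, the project team systematically constructed CASM, a mathematical model of the auditory spectrum in the MP3 compressed domain; proposed a preprocessing mechanism that imitates how the human ear processes audio information; performed feature selection based on the MP3 compressed-domain auditory spectrum; used an entropy-based similarity measure to study how the uncertainty reasoning process affects audio classification and retrieval; evaluated similarity with an entropy method; classified audio with the fuzzy-rough nearest neighbor algorithm (FRNNC); and established a fast and accurate retrieval method, obtaining good experimental results. The approach not only simplifies the workflow of compressed-domain audio classification and retrieval, but also offers a new way to extract robust compressed-domain audio features from massive compressed audio data. The project team was granted 1 invention patent, applied for 5 invention patents, and published 48 papers, including 7 in international academic journals, with 5 indexed by SCI, 3 by ISTP, and 41 by EI, and took part in domestic and international cooperation and exchange on multiple occasions.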
Chinese keywords (translated): compressed domain; auditory spectrum; feature extraction; classification and retrieval
English abstract: With the improvement of computer processing capacity, the development of the Internet, and people's increasing demand for audio information, the question of how to retrieve massive amounts of compressed-format audio data effectively has drawn the attention of many researchers. We systematically built CASM, a mathematical model of the auditory spectrum in the MP3 compressed domain, and proposed a preprocessing mechanism that imitates how the human ear processes audio information. Based on the auditory spectrum of the MP3 compressed domain, we conducted feature selection. We evaluated similarity with an entropy method and classified the audio with the fuzzy-rough nearest neighbor classifier (FRNNC), obtaining good experimental results. The method not only simplifies the compressed-domain audio classification and retrieval process, but also provides new ideas for extracting robust features. On this basis, we applied for 5 patents, were granted 1 patent, and published 48 papers, including seven in international journals. Five papers were indexed by SCI, three by ISTP, and forty-one by EI. We also took part in numerous international collaborations and exchanges.
English keywords: compressed domain; auditory spectrum; feature extraction; classification and retrieval