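Project title: Research on Segment-Level Semantic Recognition of Sound Based on Internal and Latent Semantic Features
Project number: No.61471145
Project type: General Program
Year of approval: 2015
Discipline: Radio electronics; telecommunications technology
Project author: 韩纪庆 (Han Jiqing)
Author affiliation: 哈尔滨工业大学 (Harbin Institute of Technology)
Project funding: 860,000 CNY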
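Abstract (translated from the Chinese): Semantic recognition of non-speech sound is one of the core topics in research on sound perception and understanding. However, current work in this area, whether acoustic event detection or computational scene analysis, mostly addresses only the low-level semantic recognition of local acoustic objects within a sound, and rarely studies the recognition of the overall semantics of a sound segment (in this project, a sound of a certain duration). Segment-level semantic recognition of sound is an emerging research direction with many problems still to be solved. This project recognizes the overall semantics of a sound segment based on the internal and latent semantic features of the sound. Internal semantics are those that can be obtained directly from the content of the sound segment itself; latent semantics are abstract semantics that must be distilled from several similar sound segments with the help of human experience and knowledge. The main research content includes: constructing and optimizing codebooks suited to representing and extracting the semantic features of sound segments; extracting the internal and latent semantic features of sound segments; extracting sound background information that can supply more prior knowledge for semantic recognition; and recognizing the overall semantics of a sound segment by combining the two types of semantic features with this prior knowledge. The work has theoretical significance and practical value for improving the ability of computers to understand sound and, in turn, for moving that ability toward real-world applications.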
Keywords (translated from the Chinese): sound segment; semantic analysis; internal semantics; latent semantics
English abstract: The semantic recognition of non-speech sound is one of the core topics in research on sound perception and understanding. However, most research on computational auditory scene analysis and acoustic event detection focuses only on recognizing local acoustic objects at a low semantic level, and rarely explores the recognition of a whole sound segment (in this project, a sound of a given duration) at a global semantic level. As a new research direction, segment-level semantic recognition of sound still has many difficult problems to be solved. In this project, segment-level semantic recognition is carried out based on the extraction of internal and latent semantic features. The internal semantics can be obtained directly from the content of a sound segment, while the latent semantics are abstract information distilled from several similar sound segments with the help of human experience and knowledge. The main work in this project includes constructing and optimizing a codebook suited to representing semantic features, extracting the internal and latent semantic features respectively, extracting background information from a sound segment to provide more prior knowledge, and recognizing the overall semantics of a sound segment by jointly using the two types of features and the prior knowledge. This research has both theoretical significance and practical value for improving the ability of computers to understand sound and for promoting its use in real applications.
English keywords: Sound Segment; Semantic Analysis; Internal Semantic; Latent Semantic