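Project title: Research on Key Technologies for the Automatic Construction of Chinese Sentiment Resources (中文情感资源自动构建的关键技术研究)
Project number: No.61300156
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Automation technology; computer technology
Principal investigator: Xu Ge (徐戈)
Affiliation: Minjiang University (闽江学院)
Funding: 230,000 CNY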
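Chinese abstract: Sentiment analysis is currently a research hotspot in computational linguistics, and sentiment resources are the foundation that supports it. At present, Chinese sentiment resources are relatively scarce and of limited quality; moreover, when the domain changes, existing resources are often insufficient or even inapplicable. This project aims to build high-quality, broad-coverage sentiment resources quickly and automatically from large-scale unlabeled corpora, and will pursue research in the following four directions: (1) Propose a sentiment model suited to automatic text processing that covers the various sentiment granularities and sentiment characteristics; design the sentiment annotation specification in the form of true/false questions (mined automatically from the corpus), so that the annotation process is clear and operable and the annotation results are highly consistent. (2) Propose an asymmetry-based polarity judgment method that can automatically determine the polarity of opinions using unlabeled corpora. The method can also take contextual information into account and thereby judge polarity more accurately. (3) Build a sentiment resource for Chinese characters. In addition, drawing on research into word formation, provide a scheme for sentiment propagation between Chinese characters and words, to be applied to inferring word sentiment when statistical information is sparse. (4) Use sequence mining algorithms to extract typical polarity-shifting patterns (negation, adversative transition, etc.).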
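Chinese keywords: sentiment analysis; automatic resource construction; polarity asymmetry; Chinese character sentiment; polarity shifting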
English abstract: Sentiment analysis has been a research hotspot in computational linguistics in recent years, and sentiment resources are the basis on which sentiment analysis is built. At present, Chinese sentiment resources are inadequate and their quality needs improvement. Furthermore, for a new domain, existing sentiment resources are often insufficient or even inapplicable. This project aims to automatically build high-quality, high-coverage Chinese sentiment resources from any large-scale unlabeled corpus. We will investigate four issues: (1) Propose a sentiment model for automatic processing that can cover a variety of sentiment categories and sentiment characteristics; use true/false judgment questions to design the sentiment annotation specification, making annotation clear and operable and yielding highly consistent results. (2) Propose a method for polarity classification based on asymmetry, which can identify the polarity of an opinion using an unlabeled corpus. This method can also use contextual information to improve polarity classification. (3) Provide a sentiment propagation strategy between Chinese characters and words based on word formation analysis, which is particularly effective when statistical information is lacking. (4) Extract typical polarity-shifting patterns (negation, adversative conjunctions, etc.) from
English keywords: sentiment analysis; automatic resource building; polarity asymmetry; Chinese character sentiment; polarity shifting