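Project title: Research on the Application of Graph-Based Transfer Learning to Text Sentiment Analysis
Project number: No. 61202254
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Computer Science
Project author: 孟佳娜 (Meng Jiana)
Author affiliation: 大连民族学院 (Dalian Nationalities University)
Funding: 220,000 yuan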
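Chinese abstract (translated): In text sentiment analysis, the assumption that the data are identically distributed often fails, which reduces the generalization ability of machine learning models. This project proposes to study the problem with graph-based transfer learning methods. The work comprises the following: first, analyze the characteristics of features in sentiment-bearing text and use a graph model to represent the relationships among labeled texts, features, and unlabeled texts, establishing a basic graph structure model for transfer learning; on this basis, combine feature-based and instance-based transfer learning methods to propose a graph-based knowledge propagation transfer learning algorithm; incorporate semi-supervised learning to propose a graph-based co-transfer algorithm; finally, according to the different situations of labeled samples in the target domain, carry out a comparative study of transductive, inductive, and unsupervised transfer learning models, and verify the effectiveness of the models on relevant corpora. The project takes the graph structure as the basic representation model for text and its semantics, starts from the construction of transfer learning models, and takes text sentiment analysis as the application domain, aiming to propose effective transfer learning methods and to provide new ideas and a theoretical basis for further research on and application of transfer learning techniques.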
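Chinese keywords (translated): transfer learning; text sentiment orientation; graph structure; transductive transfer learning; inductive transfer learning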
English abstract: Because the generalization ability of machine learning degrades in text sentiment analysis when the data do not satisfy the identical-distribution assumption, this project proposes a class of graph-based transfer learning methods to address this problem. The research comprises the following. First, we analyze the characteristics of features in sentiment text, represent the relationships among labeled texts, features, and unlabeled texts, and establish a basic graph model for transfer learning. On this basis, we combine feature-based and instance-based transfer learning methods and propose a graph-based knowledge propagation transfer learning algorithm. We then draw on semi-supervised learning to propose a graph-based co-transfer learning algorithm. Finally, according to the different situations of labeled instances in the target domain, we compare transductive, inductive, and unsupervised transfer learning, and verify the effectiveness of the models on related corpora. The project takes the graph model as the basic representation of text and its semantic structure, starts from the construction of transfer learning models, and takes text sentiment analysis as the application domain, with the aim of proposing effective transfer learning methods and providing new ideas and a theoretical basis for further research on and application of transfer learning.
English keywords: transfer learning; text sentiment; graph model; transductive transfer learning; inductive transfer learning