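Project title: Research on Resource Construction and Automatic Analysis Models for Chinese Connective Structures
Project number: No.61202193
Project type: Young Scientists Fund
Year of approval: 2013
Discipline: Computer Science
Principal investigator: Chen Bo (陈波)
Affiliation: Hubei University of Arts and Science (湖北文理学院)
Funding: 280,000 CNY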
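Chinese abstract: Chinese connective structures (complex sentences formed with conjunctions) carry rich and complex semantic information. Chinese information processing has long concentrated on representing and analyzing single sentences, while the modeling and analysis of complex sentences has been comparatively neglected. This project aims to build a complete ontology of Chinese connectives, study the semantic dependency structure of connective structures, propose a representation mechanism based on semantic dependency graphs (directed graphs), construct large-scale annotated resources, and explore analysis strategies based on discriminative models. Semantic dependency structure is free of the constraints of syntactic dependency: it allows nodes with multiple parents as well as crossing dependencies. The resources to be built comprise a Chinese connectives ontology and 20,000 example sentences selected from real corpora. A two-stage discriminative analysis model based on a log-linear formulation is used to analyze the semantic dependencies of connective structures, with features designed to capture both local and global structural information. The project helps explore semantic description mechanisms suited to the actual characteristics of Chinese, enriches Chinese semantic resources and semantic analysis strategies, and is of some value for improving the performance of technologies such as automatic Chinese analysis, textual entailment, information extraction, and discourse understanding.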
Chinese keywords: connective structure; semantic dependency; semantic resource; discriminative model;
English abstract: Chinese connective structures contain rich and complex semantic information. Chinese information processing has long focused on the representation and analysis of single sentences and has neglected the modeling and analysis of complex sentences. The project aims to establish a Chinese connectives ontology, study the semantic dependency structure of connectives, propose a representation mechanism based on semantic dependency graphs, build large-scale annotated resources, and explore analysis strategies based on discriminative models. Semantic dependency structure avoids the limitations of syntactic dependency by allowing nodes with multiple parents and crossing dependency relations. We will build a Chinese connectives ontology and a collection of 20,000 complex sentences drawn from a real corpus. The feature design can characterize structured information. This project will help to explore semantic representation mechanisms suited to the actual characteristics of Chinese, enrich Chinese semantic resources and semantic analysis strategies, and improve the performance of Chinese automatic parsing, textual entailment, information extraction, and discourse understanding.
English keywords: Connective structure; semantic dependency; semantic resource; discriminative model;