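Project title: Research on Several Key Technologies for Statistics-Based Uyghur Dependency Parsing
Project number: No. 61262061
Project type: Regional Science Fund Project
Year of approval: 2013
Discipline: Automation technology; computer technology
Principal investigator: 麦热哈巴·艾力
Affiliation: Xinjiang University
Funding: 430,000 yuan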
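Abstract (translated from Chinese): Dependency parsing is an important research topic in natural language processing. Researchers in China and abroad have done extensive work in this area, providing a theoretical foundation and techniques for deep language analysis. Research on Uyghur syntactic parsing has only just begun; it has mainly explored the phrase-structure grammar framework and has not yet addressed dependency parsing. The particular characteristics of Uyghur undoubtedly pose certain difficulties for its dependency parsing: Uyghur has SOV word order and is a typical agglutinative language with rich morphological variation, very strong derivational capacity, and a large number of suffixes, which not only give the stem new grammatical functions but also carry some semantic information. With the aim of providing a research basis for further work on Uyghur dependency parsing and Uyghur semantic analysis, this project focuses on several key issues in Uyghur dependency parsing: determining dependency units and dependency relations; converting between a phrase-structure treebank and a dependency treebank; and building several commonly used statistical models for dependency parsing. The final goals are to construct a Uyghur dependency treebank of at least 20,000 sentences and to propose a statistics-based dependency parsing method suited to the characteristics of Uyghur.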
Keywords (translated from Chinese): dependency grammar; dependency treebank; Uyghur; syntactic parsing; dependency parsing
English abstract: Dependency parsing is an important part of natural language processing and a basis for semantic analysis. Researchers in China and abroad have studied it extensively and have applied the results in other fields, which has improved the information-processing capability for those languages. Uyghur parsing is just getting started; existing work uses a phrase-structure grammar system, and dependency parsing research remains undeveloped. Dependency parsing plays an irreplaceable role in Uyghur information processing. On the other hand, Uyghur has special characteristics that undoubtedly bring some difficulties to Uyghur dependency parsing: it is an agglutinative language in which a sequence of inflectional and derivational morphemes is affixed to a root, and at the syntactic level the constituent order is SOV. With the intention of serving further research on Uyghur dependency parsing and semantic analysis, the project focuses on several important topics, including the determination of dependency units and dependency types; the method of converting a phrase-structure treebank to a dependency treebank; and the establishment of several common dependency parsing algorithms based on statistical models. Our goal is to build a Uyghur dependency treebank of at least 20,000 sentences, and develop Uyg
English keywords: dependency grammar; dependency treebank; Uyghur; parsing; dependency parsing