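Project title: Research on Several Key Technologies for Chinese Dependency Parsing
Project number: No.60803093
Project type: Young Scientists Fund project
Year of approval: 2009
Discipline: Weapons industry
Project author: Wanxiang Che (车万翔)
Affiliation: Harbin Institute of Technology
Funding: 190,000 yuan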
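Abstract (translated from Chinese): Syntactic parsing is a core problem in natural language processing and provides important support for applications such as information extraction and machine translation. Dependency grammar has attracted growing attention because its formalism is concise, easy to annotate, and convenient to apply. Although research on Chinese dependency parsing has made some progress, its accuracy and efficiency still fall short of what practical applications require. Targeting the characteristics of Chinese and the difficulties of Chinese syntactic parsing, and oriented toward practical applications, this project studied Chinese dependency parsing from the following five aspects: 1. comparing the performance of transition-based and graph-based models on Chinese; 2. proposing a beam-search-based high-order dependency parsing model, which took part in the CoNLL 2009 joint evaluation of syntactic and semantic dependency parsing and ranked first; 3. exploring a model that combines graph-based and transition-based parsing, further improving parsing accuracy; 4. proposing a fragment-based two-stage method for Chinese dependency parsing, which substantially improved parsing efficiency; 5. building a joint learning model for part-of-speech tagging and syntactic parsing, which to some extent mitigated the effect of low Chinese part-of-speech tagging accuracy on parsing. The project leader's team won the first prize of the Qian Weichang Chinese Information Processing Science and Technology Award for developing the "Language Technology Platform (LTP)", in which the Chinese dependency parser is the core system; the project leader also won the first prize of the Hanwang Youth Innovation Award for this work.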
Chinese keywords (translated): dependency parsing; beam search; CoNLL;
English abstract: Syntactic parsing is a core problem in natural language processing. It supports many applications, such as information extraction and machine translation. Dependency parsing, with its simple grammatical form, ease of annotation, and convenience for applications, is gradually receiving more attention. Although Chinese dependency parsing has made some progress recently, its accuracy and efficiency are still unable to meet the needs of practical applications. This project addresses the characteristics and difficulties of the Chinese language. Oriented toward practical applications, it improves Chinese syntactic parsing in the following five respects. 1. Comparing transition-based and graph-based dependency models on Chinese data sets. 2. Proposing a beam-search-based high-order dependency parsing model. We participated in the CoNLL 2009 shared task on syntactic and semantic dependency parsing and achieved first place. 3. Combining graph-based and transition-based dependency parsing models to further improve accuracy. 4. Proposing a fragment-based two-stage Chinese dependency parsing model to improve efficiency. 5. Jointly modeling Chinese POS tagging and dependency parsing to mitigate the effect of the low accuracy of Chinese POS tagging. The project leader's team received the "Weichang Qian Chinese Information Processing Science and Technology Award" for its development of the Language Technology Platform (LTP). The Chinese dependency parser is the most important system in LTP. The project leader was awarded the "HanWang Young Innovation Award".
English keywords: dependency parsing; beam search; CoNLL;