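Project title: Research on Probabilistic Decision-Action Models and Adaptation Techniques for Chinese Dependency Parsing
Project number: No.60875041
Project type: General Program
Year of approval: 2009
Project discipline: Mining Engineering
Project author: Zhao Jun (赵军)
Author affiliation: Institute of Automation, Chinese Academy of Sciences
Project funding: 350,000 yuan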
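Chinese abstract: Dependency parsing is one of the important tasks in natural language processing. In application systems such as machine translation, automatic question answering, and information extraction, dependency parsing supplies information about sentence structure. This project studies probabilistic decision-action models and adaptation techniques for dependency parsing. The main research topics include: web-mining-assisted dependency parsing; Chinese dependency parsing that incorporates hierarchical semantic knowledge; dependency parsing based on knowledge transfer across multiple treebanks; and applications of dependency parsing in natural language processing (key technologies for community question answering systems, entity disambiguation and attribute extraction based on knowledge association, web-mining-assisted extraction of organization name translations, domain adaptation for text sentiment analysis, and others). The main results include: 30 academic papers, 11 of them at top international conferences; two granted national invention patents; two applications for national software copyright registration; and one international academic award (ACM KDD-CUP). As an invited lecturer at the 21st session of the China Computer Federation's 《学科前沿讲习班》 (discipline frontiers lecture series), "Internet-Oriented Natural Language Processing Technology", the project leader lectured young scholars and students from across the country on dependency parsing, information extraction, opinion mining and sentiment analysis, and question answering systems. Several doctoral and master's students were trained.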
Chinese keywords: natural language processing; dependency parsing; structured machine learning
English abstract: Dependency parsing is one of the important tasks in natural language processing (NLP). It provides structural information about sentences to application systems such as machine translation, question answering, and information extraction. This project focuses on probabilistic models for action-based Chinese dependency parsing and domain adaptation, including: (1) web-mining-assisted statistical dependency parsing; (2) incorporating hierarchical semantic knowledge into Chinese dependency parsing; (3) knowledge transfer across multiple treebanks for dependency parsing; (4) applications of dependency parsing in natural language processing (community question answering; named entity disambiguation and attribute extraction based on concept association; organization name translation with the assistance of web information; domain adaptation for opinion extraction and sentiment analysis). The main achievements include: 30 papers, of which 11 appeared at top-ranked international conferences; 2 national invention patents; 2 software copyright registrations; and one international award (ACM KDD-CUP). The related achievements have been highly regarded by domestic and international experts and have strongly promoted the development of research in this field.
English keywords: Natural language processing; dependency parsing; structured machine learning