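Project title: Research on Computational Models of Construction Grammar
Project number: No.61473101
Project type: General Program
Year of approval: 2015
Discipline: Automation technology; computer technology
Project author: Chen Qingcai (陈清财)
Author affiliation: Harbin Institute of Technology (哈尔滨工业大学)
Funding: 800,000 CNY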
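Chinese abstract: Construction grammar is a cognitive grammar framework with strong power to explain linguistic phenomena. Although it is fairly mature as a linguistic theory, it cannot yet effectively support natural language processing (NLP) and its applications, because of several key problems: it lacks a computable formal definition, basic resources are severely lacking, and its computational mechanisms remain unclear. This project therefore aims to build an effective bridge of construction computing between linguistics and NLP. It explores a computable formal representation model for cognitive construction grammar as described from the linguistic perspective, and builds an open platform system supporting construction definition, construction corpus annotation, visual representation of constructions, and construction analysis, thereby addressing the shortage of basic resources and tools for construction research. On this basis, starting from typical constructions and drawing on deep learning and other active NLP techniques, the project studies methods and mechanisms of construction computing, including automatic analysis and annotation of construction grammar, learning models for quantitative representations of constructions, and applications of constructions. The goal is to lay an initial theoretical and practical foundation for building and applying computational models of construction grammar, and to contribute to the development and dissemination of construction grammar as well as to the advancement of NLP technology.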
Chinese keywords: Construction grammar; computable model; semantic analysis; natural language processing; deep learning
English abstract: Although Construction Grammar (CG), with its strong explanatory power for language phenomena and language learning, has become a mature linguistic theory, it remains unfamiliar in most NLP tasks. The main reasons include the absence of a computation-oriented formal representation, the lack of a large-scale construction knowledge base and of tools for building one, and the lack of practical experience in applying CG to NLP, all of which are major obstacles to applying CG effectively to NLP tasks. In this project, we aim to build a construction-computing bridge between linguistics and NLP. First, following the linguistic definition of constructions and their properties, we build a computation-oriented formal representation of construction grammar. Then we develop an open platform with functions for defining new constructions, annotating corpora with constructions, visualizing annotation results, etc., to break the bottleneck caused by the lack of basic construction resources and tools. On this basis, we study models and methods for parsing, automatic annotation, and quantitative representation learning of construction grammars by introducing NLP techniques such as deep learning and word embedding learning, which leads toward the computational modelling of construction grammars and builds a foundation for applying them in real NLP tasks. Through this project, our goal is to provide NLP researchers and engineers with usable tools and models for construction-based NLP techniques, and to contribute to both the improvement of construction grammar theory and the development of NLP techniques.
English keywords: Construction Grammar; Computational Model; Semantic Analysis; Natural Language Processing; Deep Learning