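Project title: Research on Chinese Semantic Parsing Based on Dependency-Based Hybrid Trees under a Compositional Vector Learning Framework
Project number: No.61472191
Project type: General Program
Year of approval: 2015
Discipline: Computer Science
Principal investigator: Zhou Junsheng (周俊生)
Affiliation: Nanjing Normal University (南京师范大学)
Funding: 780,000 CNY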
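Abstract (translated from Chinese): Semantic parsing aims to convert a natural language sentence into a fully formal meaning representation, so that the sentence can be automatically understood and executed by a computer. Motivated by the practical need for a Chinese natural language interface to GIS, and addressing the shortcomings of feature engineering in existing discriminative semantic parsing models, this project explores a new approach to semantic parsing based on automatic learning of feature vectors within the compositional vector learning framework of deep learning. To this end, a new tree construction mechanism must first be designed, one that can flexibly bridge the correspondence between a natural language sentence and its formal semantic representation while also reflecting the syntactic structure of the sentence. This construction is then treated as a latent variable, and a joint learning method based on compositional vector computation and structured prediction is used to achieve more effective semantic parsing by combining the syntactic structure information in the latent variable with the semantic information in distributed word and phrase vectors. The main research topics are: building a large-scale Chinese semantic parsing corpus; designing the tree construction mechanism that serves as the latent variable; selecting and designing a Chinese word vector learning model; modeling multi-layer neural networks under the compositional vector learning framework; designing the corresponding inference and learning algorithms; and testing and applying the approach in a real Chinese GIS system.
Keywords (translated from Chinese): semantic parsing; natural language interface; deep learning; recursive neural network; dependency-based hybrid tree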
English abstract: Semantic parsing is the task of mapping a natural language sentence into a complete, formal meaning representation expressed in a meaning representation language, that is, a formal, unambiguous language that allows for automated inference and processing. Considering the drawbacks of feature engineering in existing discriminative models and the practical need for natural language interfaces to GIS systems, this project explores a new approach to semantic parsing based on automatic feature learning under a compositional vector framework in deep learning. To this end, we first need to design a new construction mechanism that can not only bridge the gap between a natural language sentence and its corresponding formal meaning representation, but also capture the syntactic information of the sentence. Then, by treating the proposed construction as a latent variable, we use a deep learning model that jointly learns compositional vector representations and structured prediction to implement an effective semantic parsing system. The main research topics of this project are: the construction of a large-scale corpus for semantic parsing; the design of the new construction mechanism; the selection and design of neural network models for word vector learning; the modeling of compositional vector learning with multi-layer neural networks; the design of the corresponding inference and learning algorithms; and the final application and testing of the proposed semantic parsing models and algorithms in real GIS systems.
English keywords: semantic parsing; natural language interface; deep learning; recursive neural network; dependency-based hybrid tree