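Project title: Research on Deep Mining Methods for Associative Knowledge in Health Management Data (面向健康管理数据的关联型知识深度挖掘方法研究)
Project number: No.61502014
Project type: Young Scientists Fund project (青年科学基金项目)
Year of approval: 2016
Project discipline: Automation technology; computer technology
Project author: Xu Yan (许焱)
Author affiliation: Peking University (北京大学)
Project funding: CNY 70,000 (7万元)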
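Chinese abstract (translated): Mining associative knowledge from massive health management data is a key step in turning information into structured knowledge, and it provides an important basis for further automated knowledge reasoning and personalized health diagnosis and treatment. The robustness and accuracy of traditional data mining methods suffer greatly when they process massive data, whereas deep learning methods can overcome this difficulty and are already widely applied in image recognition and speech recognition. In natural language processing, however, the structural complexity and semantic diversity of text data mean that designing deep neural networks remains a major challenge. To address these problems and building on existing work, this project takes massive health management data as its target and studies how to use deep learning to mine associative knowledge from it. Specifically, it covers: definition of health-management relation categories; conversion between sentence parse tree models and graph models; semantic pruning strategies for graph models; design of deep neural networks; kernel-trick relation extraction based on deep neural networks; and encapsulation of health-management associative knowledge. The key problems to be solved are the design of deep neural networks based on subgraphs generated by pruning and the design of kernel functions based on deep neural networks.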
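Chinese keywords (translated): relation extraction; health management; deep learning; recurrent neural network; convolutional neural network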
English abstract: Mining association knowledge from big health care data is a key step in structuring information into knowledge, and it provides important evidence for further automatic knowledge reasoning and personalized health diagnosis and treatment. Traditional data mining methods lose much of their robustness and accuracy when processing big data, whereas deep learning can overcome this obstacle and is already widely used in image recognition and speech recognition. In natural language processing, however, the structural complexity and semantic diversity of textual data mean that the design of deep neural networks remains a tremendous challenge. To address these problems, this project builds on existing work and explores deep learning methods for mining association knowledge from big health care data. The main contents include: definition of relation categories related to health care; transformation between sentence parse tree models and graph models; semantic pruning strategies for graph models; design of deep neural networks; a deep neural network based kernel trick for relation extraction; and encapsulation of health care association knowledge. The key issues are the design of deep neural networks based on pruned subgraphs and the design of kernel functions based on deep neural networks.
English keywords: relation extraction; health management; deep learning; recurrent neural network; convolutional neural network