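Project title: Research on causal structure learning algorithms for nonlinear non-Gaussian data
Project number: No.61305064
Project type: Young Scientists Fund project
Approval year: 2014
Discipline: Automation technology, computer technology
Project author: Yang Jing (杨静)
Affiliation: Hefei University of Technology (合肥工业大学)
Funding amount: CNY 230,000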
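Abstract (translated from Chinese): Mining the causal relationships implied in nonlinear, non-Gaussian continuous data is an emerging research focus in data mining, and high computational complexity is a major problem currently facing such learning. This project plans to build on local learning theory in order to reduce the complexity of learning. First, for linear non-Gaussian data, it will further explore the distribution of the partial correlation coefficient, construct a correlation measure based on hypothesis testing, and combine it with a local learning strategy to build fast and effective causal structure learning algorithms. Second, for nonlinear non-Gaussian data, it will explore describing the data with simultaneous equation models and polynomial fitting theory, establish the link between the equation coefficients and correlation, and then integrate a local learning strategy to construct fast and effective causal structure learning algorithms. To handle high-dimensional nonlinear non-Gaussian big data, it will explore an online causal structure learning framework based on streaming features, investigate criteria for nonlinear non-Gaussian conditional independence tests, construct an online Markov blanket update method, and, drawing on local learning, build an online causal structure adjustment method, ultimately yielding a fast and effective online structure learning algorithm model based on streaming features. The results can lay a theoretical and methodological foundation for causal discovery from nonlinear non-Gaussian data.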
Keywords (translated from Chinese): structure learning; nonlinear; non-Gaussian; local learning; causal discovery
English abstract: Mining the causal relationships implied in non-linear non-Gaussian continuous data is an emerging research hotspot in data mining, and high computational complexity is currently an important issue. This project plans to carry out research based on local learning theory in order to reduce the learning complexity. First, for linear non-Gaussian data, we will further explore the distribution of the partial correlation coefficient, investigate a correlation measure based on hypothesis testing, and construct fast and effective causal structure learning algorithms using a local learning strategy. Second, for non-linear non-Gaussian data, we plan to study the simultaneous equations model and polynomial approximation theory and apply them to describe the data, then establish the association between the equation coefficients and correlation, and integrate the local learning strategy to build fast and efficient causal structure learning algorithms. To deal with high-dimensional non-linear non-Gaussian data, we will investigate an online causal structure learning framework based on streaming features, explore a non-linear non-Gaussian conditional independence test criterion, put forward a Markov blanket online updating method and online causal structu
English keywords: structure learning; non-linear; non-Gaussian; local learning; causal discovery
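
Illustrative note: the abstract's first step pairs the partial correlation coefficient with a hypothesis test to decide whether two variables are conditionally independent. The sketch below is not the project's algorithm; it shows the textbook version of that idea on a simulated linear chain X → Z → Y with uniform (non-Gaussian) noise. The function names `pearson`, `partial_corr`, and `fisher_z_pvalue`, the coefficient 0.8, and the sample size are all assumptions made for illustration. The Fisher z-test used here is derived under a Gaussian assumption, so on non-Gaussian data its p-values are only approximate, which is presumably why the project proposes to study the distribution of the partial correlation coefficient further.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value of the Fisher z-test for zero (partial) correlation.

    n is the sample size and k the number of conditioning variables.
    Assumption: the Gaussian-based test is used only as an approximation.
    """
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    stat = math.sqrt(n - k - 3) * math.atanh(r)
    return math.erfc(abs(stat) / math.sqrt(2))


# Hypothetical data: linear chain X -> Z -> Y with uniform noise terms.
rng = random.Random(0)
n = 2000
x = [rng.uniform(-1, 1) for _ in range(n)]
z = [0.8 * xi + rng.uniform(-1, 1) for xi in x]
y = [0.8 * zi + rng.uniform(-1, 1) for zi in z]

r_marginal = pearson(x, y)          # X and Y are dependent through Z
r_partial = partial_corr(x, y, z)   # should be close to zero given Z
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_partial = fisher_z_pvalue(r_partial, n, 1)

print("marginal r:", round(r_marginal, 3), "p-value:", p_marginal)
print("partial r given Z:", round(r_partial, 3), "p-value:", p_partial)
```

In a constraint-based learner, a small marginal p-value together with a near-zero partial correlation given Z would be taken as evidence that there is no direct edge between X and Y. A local learning strategy would restrict such tests to candidate neighbours of one target variable at a time instead of testing all variable pairs.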