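Research on Markov Logic Theory and Algorithms for Deep Analysis of Big Data

Project name: Research on Markov Logic Theory and Algorithms for Deep Analysis of Big Data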

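Project number: No.61303179

Project type: Young Scientists Fund project

Year of approval: 2014

Discipline: Automation technology, computer technology

Project author: 孙正雅 (Sun Zhengya)

Author affiliation: Institute of Automation, Chinese Academy of Sciences

Funding: 230,000 yuan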

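Chinese abstract: As a full combination of first-order logic and probabilistic graphical models, Markov logic is regarded as one of the most important techniques for deep data analysis; however, most algorithms developed under this framework do not scale well. To improve the ability to acquire knowledge and insight from big data, this project takes Markov logic as its underlying theoretical framework and plans to study big data deep analysis techniques systematically from three aspects: feature representation, parameter optimization, and the construction of an incremental learning system. First, to address the diverse data types and complex relations of big data, it plans to draw on predictive clustering trees and frequent sequential pattern mining to study hierarchical concept learning for relational n-tuples, and on that basis to propose a novel structure learning algorithm that searches paths across heterogeneous relations and constructs logical rules automatically. Second, to address the sheer scale of big data, it plans to draw on deep sum-product network theory to develop new online parallel optimization algorithms for learning the parameters of uncertain rules. Finally, to accommodate the continual arrival of new data, it incorporates incremental learning into both feature representation and parameter optimization and builds an incremental learning system for deep analysis of big data. Acquiring deep semantic information from big data quickly and accurately helps support scientifically predictive decisions and judgments.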

Chinese keywords: big data; statistical relational learning; Markov logic; sum-product networks; incremental learning

English abstract: Markov logic has been regarded as one of the most important tools for deep data analysis because it fully combines the expressiveness of probabilistic graphical models and first-order logic. However, as we enter the "big data" era, the ever-rising scale of the data makes progress in this paradigm increasingly difficult. To enhance the ability to acquire knowledge and insights from big data, this project conducts a systematic study of the Markov logic framework from three aspects: feature representation, parameter optimization, and incremental learning system building. To handle the varied types and complex relations of big data, we first develop an effective hierarchical conceptualization algorithm for relational n-tuples by introducing the ideas of predictive clustering trees and frequent sequential pattern mining. On this basis, a novel structure learning algorithm is designed to find paths between heterogeneous relations and automatically construct formulas. Furthermore, we introduce deep sum-product networks to address parameter learning for large-scale data, for which new online parallel optimization strategies are devised. Faced with the emergence of massive new data, we finally investigate feature representation and parameter optimization from an incremental learning perspective, and build an integrated system for in-dep

English keywords: Big Data; Statistical Relational Learning; Markov Logic; Sum-Product Networks; Incremental Learning

