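Project title: Research on a "Statistics + Structure" Machine Learning Theory and Methods
Project number: No.61472423
Project type: General Program
Year of approval: 2015
Discipline: Other
Project author: Wang Jue (王珏)
Institution: Institute of Automation, Chinese Academy of Sciences
Funding: 830,000 CNY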
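Chinese abstract: With the arrival of the big data era, the demand for semantic understanding of massive heterogeneous data has become increasingly prominent, and machine learning has become key to semantic understanding and knowledge acquisition. Existing statistical machine learning emphasizes the law of excluded middle but breaks the law of causality, whereas symbolic machine learning emphasizes the law of causality but breaks the law of excluded middle. Exploring a compromise between the two and forming a new machine learning theory has become a focus of current machine learning research. This project aims to propose a semantic probabilistic graphical model and to carry out research on statistics + structure machine learning in three respects: knowledge representation theory and models, key technologies, and validation on real cases. First, we study the theory of statistics + structure knowledge representation and propose a semantic probabilistic graphical model based on semantic triples, providing a theoretical and modeling foundation for representing and solving complex problems. Second, we study key technologies on semantic probabilistic graphs, namely knowledge representation and knowledge cluster refinement, structure discovery via deep learning, adaptive learning of model parameters, and exact and approximate inference, in order to realize model generation and inference. Finally, we validate the proposed theory and key technologies on the large-scale network data processing tasks we have undertaken for online transactions and logistics distribution at JD.com (京东商城). Through this research, we explore the point of convergence between traditional statistical machine learning and symbolic machine learning, which is of great significance for the theory and application of big data machine learning.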
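Chinese keywords: Machine learning; Knowledge representation; Knowledge reasoning; Semantic probabilistic graphical model; Statistics + Structure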
English abstract: With the arrival of the big data era, the need for semantic understanding of vast amounts of heterogeneous data has become increasingly prominent, and machine learning has become the key technology for semantic understanding and knowledge acquisition. Existing statistical machine learning approaches emphasize the law of excluded middle and break the law of causality, while symbolic machine learning emphasizes causality and breaks the law of excluded middle. Exploring a compromise between the two, and thereby forming a new machine learning theory, is a focus of current machine learning research. This project aims to propose a semantic probabilistic graphical model and to study Statistics + Structure based machine learning in three respects: knowledge representation theory and models, key technologies, and verification on typical examples. First, we study the Statistics + Structure based knowledge representation theory and propose a semantic probabilistic graphical model based on semantic triples as the theoretical basis for representing and solving complex problems. Second, we study the following key technologies for model generation and reasoning: knowledge representation and knowledge cluster refinement, structure learning based on deep learning, adaptive learning of model parameters, and exact and approximate inference. Finally, we verify the proposed theories and key technologies on the JingDong e-commerce transaction and logistics distribution data processing tasks. This research explores the meeting point of traditional statistical machine learning and symbolic machine learning, which is of great significance for the theory and application of big data machine learning.
English keywords: Machine learning; Knowledge representation; Knowledge inference; Semantic probabilistic graphical model; Statistics + Structure