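Project title: Research on Copulas for High-Dimensional Correlated Data Analysis
Project number: No.61473302
Project type: General Program
Year of approval: 2015
Discipline: Other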
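Project author: Hou Chenping (侯臣平)
Author affiliation: National University of Defense Technology, Chinese People's Liberation Army (中国人民解放军国防科技大学)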
Project funding: 810,000 CNY (81万元)
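Abstract (translated from Chinese): Copulas (关联结构) are a new research topic that has emerged in recent years in statistical data analysis and machine learning. They describe correlation directly through transformations of random variables, offering a fresh approach to the analysis of correlated data. Set against the background of high-dimensional correlated data analysis, this project studies the theory, methods, and applications of copulas. On the theoretical side, it mainly explores the mechanism by which copulas describe the correlation in high-dimensional correlated data, and the estimation of typical statistics in copulas. On the methodological side, it studies, first, dimensionality-reduction-based methods for constructing copulas for high-dimensional correlated data and, second, weakly supervised classification methods based on the constructed copulas. On the application side, it uses copulas to study two concrete problems: filtering high-dimensional web-page text data and classifying biological image data. These three parts are closely linked: theoretical and methodological research guides the applied research, and applied research provides the background for the theoretical and methodological research. The project can not only enrich and extend the theory and methods of machine learning but also offer important guidance for solving many concrete problems in practice.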
Keywords (translated from Chinese): high-dimensional correlated data; copula; statistical analysis
English abstract: The copula is one of the newest research topics in both statistical data analysis and machine learning. It uses transformations of random variables to characterize correlation directly and provides a new way to analyze correlated data. This project aims to analyze high-dimensional correlated data with copula models from three aspects: theory, methodology, and application. On the theoretical side, we will examine why copulas can characterize correlations in high-dimensional data, and we will investigate the estimation of typical statistics in copula models. On the methodological side, we will construct copulas for high-dimensional correlated data using a dimensionality-reduction strategy. In addition, using the constructed copulas, we will investigate classification under weak supervision. On the application side, building on the copulas constructed, we will address two problems: filtering high-dimensional web text data and classifying biological image data. These three aspects are tightly related. The research on theory and methodology provides guidance for the applications, and the research on applications provides a real-world background for the theoretical and methodological investigations. The work in this project can not only enrich and extend the theory and methods of machine learning, but also provide meaningful guidance for solving problems in real applications.
English keywords: high-dimensional correlated data; copula; statistical analysis