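Project title: Theoretical Analysis and Inference for Eigenvalues and Eigenvectors in the Spiked Model
Project number: No.11201175
Project type: Young Scientists Fund project
Year of approval: 2013
Project discipline: Mathematical and Physical Sciences, and Chemistry
Project author: Ding Xue (丁雪)
Author affiliation: Jilin University (吉林大学)
Project funding: 220,000 CNY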
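Chinese abstract: High-dimensional data analysis is one of the important research directions in modern statistics. High-dimensional data observed in fields such as wireless communication, economics, and speech recognition often contain a large amount of noise and a small number of principal components. Because of the effect of dimensionality, classical statistical methods often yield inaccurate conclusions when applied to such data. The spiked model arose in the search for methods suited to this kind of data: it assumes that the sample dimension and the sample size both tend to infinity and that the population covariance matrix has a few outlying eigenvalues. Analyzing the properties of eigenvalues and eigenvectors in the spiked model is crucial for high-dimensional data analysis; on that basis one can estimate and test the number of signals in wireless communication and the number of latent factors in economic data, and group genetic data. This project proposes to study the properties of the sample eigenvalues and eigenvectors of the spiked model and from them make inferences about the population eigenvalues and eigenvectors, with the aim of analyzing and processing high-dimensional data more effectively.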
Chinese keywords: spiked model;eigenvalue;eigenvector;sample covariance matrix;population covariance matrix
English abstract: High-dimensional data analysis is one of the most important research areas in modern statistics. High-dimensional data observed in wireless communication, economics, and speech recognition often contain a large amount of noise and a small number of principal components. Because of the influence of the dimension of the observations, classical statistical methods often lead to inaccurate inference when applied to such data. The spiked population model emerged from the search for appropriate methods to analyze this kind of data. In a spiked population model, the population covariance matrix has a few eigenvalues well separated from the others, and both the dimension of the observations and the sample size tend to infinity. Investigating the properties of the eigenvalues and eigenvectors of the spiked population model is of great importance for high-dimensional data analysis, because from this information one can estimate the number of signals in wireless communication and the number of latent factors in economic datasets, and classify genetic data. Thus, in order to analyze high-dimensional data more efficiently, this project will study the properties of the sample eigenvalues and eigenvectors and make inference about the population eigenvalues and eigenvectors for the spiked population m
English keywords: spiked model;eigenvalue;eigenvector;sample covariance matrix;population covariance matrix
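
Illustrative note: the setting described in the abstract (a population covariance matrix with a few outlying eigenvalues, with dimension and sample size of comparable magnitude) can be sketched numerically. The Python sketch below is not part of the project record. It assumes Gaussian data with known zero mean, a single spike, and hypothetical values p = 1000, n = 1000, spike = 4. The comparison value uses a standard random-matrix limit for a spike above the threshold 1 + sqrt(p/n); it is included here as an assumption for illustration, not as a result of this project.

```python
import numpy as np

# Hypothetical illustration parameters (not taken from the project record).
p, n = 1000, 1000          # dimension and sample size, comparable in size
spike = 4.0                # the single outlying population eigenvalue
gamma = p / n              # dimension-to-sample-size ratio

rng = np.random.default_rng(0)

# Population covariance is diag(spike, 1, ..., 1): scale the first coordinate.
scales = np.ones(p)
scales[0] = np.sqrt(spike)
X = rng.standard_normal((n, p)) * scales   # rows are observations

# Sample covariance matrix (the mean is known to be zero in this sketch).
S = X.T @ X / n

# Sample eigenvalues, sorted from largest to smallest.
eigvals = np.linalg.eigvalsh(S)[::-1]
top, second = eigvals[0], eigvals[1]

# Assumed large-sample limit of the top sample eigenvalue when
# spike > 1 + sqrt(gamma): spike * (1 + gamma / (spike - 1)).
predicted = spike * (1.0 + gamma / (spike - 1.0))

print(f"population spike      : {spike:.3f}")
print(f"top sample eigenvalue : {top:.3f}")
print(f"assumed limit         : {predicted:.3f}")
print(f"second sample eigenvalue: {second:.3f}")
```

Under these assumptions the largest sample eigenvalue should sit noticeably above the population spike, which is one concrete way in which classical estimates become inaccurate when the dimension is comparable to the sample size.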