Dimension reduction plays a pivotal role in analysing high-dimensional data. However, observations with missing values make it difficult to apply standard dimension reduction techniques directly. Because many dimension reduction approaches are based on the Gram matrix, we first investigate the effects of missingness on dimension reduction by studying the statistical properties of the Gram matrix with and without missingness. We then present a bias-corrected Gram matrix with desirable statistical properties under heterogeneous missingness. Extensive empirical results on both simulated and publicly available real datasets show that the proposed unbiased Gram matrix can significantly improve a broad spectrum of representative dimension reduction approaches.
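To make the idea of a bias-corrected Gram matrix concrete, here is a minimal sketch under assumptions that the abstract does not state. It assumes the Gram matrix holds inner products between samples, that missing entries are zero-filled, and that feature k is observed independently with a known probability p[k] (one simple reading of "heterogeneous missingness"). The function name `bias_corrected_gram` and the argument `p` are hypothetical, and this is an illustrative inverse-probability correction, not necessarily the estimator proposed in the paper.

```python
import numpy as np

def bias_corrected_gram(Y, p):
    """Illustrative inverse-probability correction of a sample Gram matrix.

    Y : (n, d) array with missing entries replaced by 0 (assumption).
    p : (d,) array, p[k] = probability that feature k is observed (assumption).
    Returns an (n, n) matrix whose expectation equals X @ X.T for the
    complete data X, under independent missingness.
    """
    Y = np.asarray(Y, dtype=float)
    p = np.asarray(p, dtype=float)
    Z = Y / p                    # scale feature k by 1 / p[k]
    # Off-diagonal (i != j): sum_k y_ik * y_jk / p_k**2, since both entries
    # are observed together with probability p_k**2.
    G = Z @ Z.T
    # Diagonal: sum_k y_ik**2 / p_k, since a single entry is observed
    # with probability p_k.
    np.fill_diagonal(G, (Y ** 2 / p).sum(axis=1))
    return G

if __name__ == "__main__":
    # Small Monte Carlo illustration: average the corrected Gram matrix
    # over many random missingness masks and compare with the true one.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 4))
    p = np.array([0.9, 0.7, 0.5, 0.8])
    draws = [bias_corrected_gram(X * (rng.random(X.shape) < p), p)
             for _ in range(2000)]
    print(np.abs(np.mean(draws, axis=0) - X @ X.T).max())
```

When every p[k] equals 1 the function reduces to the ordinary Gram matrix `Y @ Y.T`. In practice the observation probabilities would have to be estimated, for example from the observed fraction of each feature, which this sketch leaves out.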