Principal component analysis is a versatile dimensionality-reduction tool with wide applications in statistics and machine learning. It is particularly useful for modeling data in high-dimensional scenarios where the number of variables $p$ is comparable to, or much larger than, the sample size $n$. Despite an extensive literature on this topic, researchers have focused on modeling static principal eigenvectors, which are not suitable for stochastic processes that are dynamic in nature. To characterize changes over the entire course of high-dimensional data collection, we propose a unified framework to directly estimate dynamic eigenvectors of covariance matrices. Specifically, we formulate an optimization problem that combines local linear smoothing and a regularization penalty with an orthogonality constraint, which can be solved effectively by manifold optimization algorithms. We show that our method is suitable for high-dimensional data observed under both common and irregular designs, and the theoretical properties of the estimators are investigated under $l_q$ $(0 \leq q \leq 1)$ sparsity. Extensive experiments demonstrate the effectiveness of the proposed method on both simulated and real data examples.
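The abstract's estimator combines local linear smoothing of the covariance with a sparsity penalty and an orthogonality (unit-norm) constraint. As a minimal illustration for a single leading eigenvector, the sketch below uses an Epanechnikov-style kernel for the local linear weights, soft-thresholding in place of the paper's penalty, and renormalization to the unit sphere as a simple retraction; all of these specific choices are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def dynamic_leading_eigvec(X, times, t0, bandwidth, threshold):
    """Estimate a sparse leading eigenvector of the covariance at time t0.

    X: (n, p) array of observations; times: (n,) observation times.
    Kernel choice, thresholding rule, and retraction are illustrative assumptions.
    """
    u = (times - t0) / bandwidth
    k = np.maximum(1.0 - u**2, 0.0)          # Epanechnikov-style kernel (assumption)
    # Standard local linear weights: w_i = k_i * (S2 - S1 * u_i)
    S1, S2 = np.sum(k * u), np.sum(k * u**2)
    w = k * (S2 - S1 * u)
    w = w / w.sum()
    # Kernel-weighted covariance estimate at t0
    Sigma = sum(wi * np.outer(x, x) for wi, x in zip(w, X))
    # Leading eigenvector, soft-thresholded for sparsity, retracted to the sphere
    v = np.linalg.eigh(Sigma)[1][:, -1]
    v = np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)
    nrm = np.linalg.norm(v)
    return v / nrm if nrm > 0 else v
```

For multiple eigenvectors, the paper's orthogonality constraint would place the iterates on a Stiefel manifold rather than the sphere, which is where the manifold optimization algorithms come in.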