We analyze the prediction error of principal component regression (PCR) and prove high-probability bounds for the corresponding squared risk conditional on the design. Our results show that if an effective rank condition holds, then PCR performs comparably to the oracle method obtained by replacing the empirical principal components with their population counterparts. On the other hand, if this condition is violated, then the empirical eigenvalues start to have a significant upward bias, resulting in a self-induced regularization of PCR. Our approach relies on the behavior of empirical eigenvalues, empirical eigenvectors and the excess risk of principal component analysis in high dimensions.
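To make the comparison between PCR and its oracle counterpart concrete, the following is a minimal simulation sketch, not the paper's experimental setup. It assumes a mean-zero Gaussian design with a diagonal population covariance (so the population principal components are the standard basis vectors), a coefficient vector supported on the top components, and hypothetical helper names (`pcr_fit`, `oracle_pcr_fit`, `excess_risk`) and parameter values chosen purely for illustration. The risk computed here is the population excess prediction risk of the fitted coefficients given the observed design, which is one plausible reading of the squared risk conditional on the design.

```python
import numpy as np


def pcr_fit(X, y, k):
    """PCR: least squares of y on the top-k empirical principal components."""
    n = X.shape[0]
    # Empirical covariance; the design is assumed to be mean zero (no centering).
    cov_hat = X.T @ X / n
    _, eigvecs = np.linalg.eigh(cov_hat)      # eigenvalues in ascending order
    U_hat = eigvecs[:, ::-1][:, :k]           # top-k empirical eigenvectors
    gamma, *_ = np.linalg.lstsq(X @ U_hat, y, rcond=None)
    return U_hat @ gamma                      # coefficients in the original basis


def oracle_pcr_fit(X, y, U_pop):
    """Oracle PCR: same regression, but on the population principal components."""
    gamma, *_ = np.linalg.lstsq(X @ U_pop, y, rcond=None)
    return U_pop @ gamma


# Illustrative (assumed) setup: three strong components and a flat, small tail.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 3
pop_eigvals = np.concatenate([[10.0, 8.0, 6.0], np.full(p - 3, 0.1)])
X = rng.standard_normal((n, p)) * np.sqrt(pop_eigvals)  # diagonal covariance
beta = np.zeros(p)
beta[:k] = [1.0, -2.0, 0.5]
y = X @ beta + 0.5 * rng.standard_normal(n)

beta_pcr = pcr_fit(X, y, k)
beta_oracle = oracle_pcr_fit(X, y, np.eye(p)[:, :k])


def excess_risk(b):
    """Population excess prediction risk (b - beta)^T Sigma (b - beta)."""
    diff = b - beta
    return float(diff @ (pop_eigvals * diff))


risk_pcr = excess_risk(beta_pcr)
risk_oracle = excess_risk(beta_oracle)
null_risk = excess_risk(np.zeros(p))          # risk of predicting zero
print("PCR risk:", risk_pcr)
print("oracle PCR risk:", risk_oracle)
```

In this assumed setup the tail eigenvalues are small relative to the sample size, so the empirical top components should track the population ones and the two risks should be of similar order. Increasing `p` or the tail eigenvalues is one way to explore the regime in which the empirical eigenvalues become biased upward.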