Outliers in a data set pose a challenge to statistical estimators: even a small fraction of outlying observations can heavily influence most classical statistical methods. In this paper we propose generalized spherical principal component analysis, a new robust version of principal component analysis based on the generalized spatial sign covariance matrix. We study theoretical properties of the proposed method, including influence functions, breakdown values and asymptotic efficiencies, and conduct a simulation study comparing it to existing methods. We also propose an adjustment of the generalized spatial sign covariance matrix to achieve better Fisher consistency properties. We illustrate that generalized spherical principal component analysis, depending on the chosen radial function, combines strong robustness and efficiency properties with a low computational cost.
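As a rough illustration of the idea, the following minimal sketch computes a generalized spatial sign covariance matrix and takes its leading eigenvectors as robust principal directions. It is not the authors' implementation. The function names `gsscm` and `generalized_spherical_pca`, the use of the coordinatewise median as the center, and the default radial function xi(r) = 1/r (which gives the classical spatial sign covariance matrix) are all assumptions made here for illustration; the abstract does not specify the location estimator or the radial functions studied in the paper.

```python
import numpy as np

def gsscm(X, radial=None, center=None):
    """Sketch of a generalized spatial sign covariance matrix (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if center is None:
        # Assumption: coordinatewise median as a simple robust center.
        center = np.median(X, axis=0)
    Z = X - center
    r = np.linalg.norm(Z, axis=1)
    if radial is None:
        # Default xi(r) = 1/r (classical spatial sign), with xi(0) = 0.
        xi = np.zeros_like(r)
        np.divide(1.0, r, out=xi, where=r > 0)
    else:
        # Any user-supplied radial function mapping distances to weights.
        xi = radial(r)
    G = Z * xi[:, None]          # transformed observations xi(||z||) * z
    return G.T @ G / X.shape[0]  # their second-moment matrix

def generalized_spherical_pca(X, n_components, radial=None):
    """Leading eigenvectors (columns) and eigenvalues of the matrix above."""
    S = gsscm(X, radial=radial)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]

# Usage: data stretched along the first axis, with 5% gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 1.0])
X[:25] = np.array([0.0, 1000.0, 0.0]) + rng.normal(size=(25, 3))
components, values = generalized_spherical_pca(X, n_components=2)
```

Because every centered observation is rescaled by a function of its distance only, a distant outlier contributes no more than a bounded amount to the matrix, and the whole computation costs about as much as an ordinary covariance matrix plus one eigendecomposition. Passing a different `radial` callable changes how strongly distant points are downweighted.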