Invariant Coordinate Selection (ICS) is a multivariate data transformation and dimension reduction method that is useful in many contexts. It can be used for outlier detection or cluster identification, and can be viewed as an independent component analysis or non-Gaussian component analysis method. The usual implementation of ICS is based on a joint diagonalization of two scatter matrices and may be numerically unstable in ill-conditioned situations. We focus on one-step M-scatter matrices and propose a new implementation of ICS based on a pivoted QR factorization of the centered data set. This factorization avoids the direct computation of the scatter matrices and their inverses, which brings numerical stability to the algorithm. Furthermore, the row and column pivoting leads to a rank-revealing procedure that makes it possible to compute ICS when the scatter matrices are not of full rank. Several artificial and real data sets illustrate the advantages of the new implementation over the original one.
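To make the contrast between the two approaches concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes the scatter pair COV and COV4 (COV4 being one example of a one-step M-scatter, weighted by squared Mahalanobis distances), and the function names `ics_classical` and `ics_qr` are hypothetical. NumPy's QR routine is unpivoted, so the row and column pivoting and the rank-revealing step described in the abstract are omitted here; the sketch only shows how a QR factorization of the centered data can replace the explicit formation and inversion of the first scatter matrix.

```python
import numpy as np


def ics_classical(X):
    """ICS via explicit scatter matrices (COV, COV4) and S1^{-1/2}."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = Xc.T @ Xc / (n - 1)                      # sample covariance
    # Squared Mahalanobis distances with respect to S1 (needs S1^{-1}).
    d2 = np.einsum("ij,ij->i", Xc @ np.linalg.inv(S1), Xc)
    S2 = (Xc * d2[:, None]).T @ Xc / (n * (p + 2))  # COV4
    # Symmetric inverse square root of S1.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Joint diagonalization: eigen-decompose the whitened second scatter.
    rho, U = np.linalg.eigh(S1_inv_sqrt @ S2 @ S1_inv_sqrt)
    order = np.argsort(rho)[::-1]                 # decreasing eigenvalues
    rho, U = rho[order], U[:, order]
    return rho, Xc @ S1_inv_sqrt @ U              # eigenvalues, scores


def ics_qr(X):
    """ICS for (COV, COV4) from a QR factorization of the centered data.

    Neither S1 nor its inverse is formed. Unpivoted QR is used, so X is
    assumed to have full column rank.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Q, _ = np.linalg.qr(Xc)                       # reduced QR: Q is n x p
    Z = np.sqrt(n - 1) * Q                        # whitened data, cov(Z) = I
    d2 = np.sum(Z ** 2, axis=1)                   # squared Mahalanobis distances
    S2_white = (Z * d2[:, None]).T @ Z / (n * (p + 2))  # COV4 of whitened data
    rho, U = np.linalg.eigh(S2_white)
    order = np.argsort(rho)[::-1]
    rho, U = rho[order], U[:, order]
    return rho, Z @ U                             # eigenvalues, scores


# Example usage on synthetic data with a small shifted group of rows.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
X_demo[:10] += 6.0
rho_demo, scores_demo = ics_qr(X_demo)
```

On well-conditioned data the two functions should agree on the eigenvalues and, up to the sign of each component, on the scores. The potential benefit of the QR route shows up when the covariance matrix is nearly singular, since squaring the data into `Xc.T @ Xc` roughly squares its condition number.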