To explain multivariate outlyingness, it is shown that the squared Mahalanobis distance of an observation can be decomposed into outlyingness contributions from individual variables. The decomposition is obtained using the Shapley value, a well-known concept from game theory that has become popular in the context of Explainable AI. Beyond outlier explanation, this concept also relates to the recent formulation of cellwise outlyingness, where Shapley values can be employed to obtain variable contributions for outlying observations with respect to their "expected" position given the multivariate data structure. In combination with squared Mahalanobis distances, Shapley values can be calculated at a low numerical cost, making them even more attractive for outlier interpretation. Simulations and real-world data examples demonstrate the usefulness of these concepts.
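The decomposition described above can be illustrated with a minimal sketch. It rests on an assumption not spelled out in the abstract: the value of a coalition of variables is taken to be the squared Mahalanobis distance of the observation after all variables outside the coalition are replaced by their means. Under that assumption the game is quadratic, and the Shapley value of variable j reduces to (x_j - mu_j) times the j-th entry of Sigma^{-1}(x - mu), which would explain the low numerical cost. The function names `shapley_mahalanobis` and `shapley_brute_force` and the example numbers are hypothetical and chosen for illustration only; the brute-force version enumerates all coalitions purely to check the closed form.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_mahalanobis(x, mu, sigma):
    """Closed-form per-variable contributions to the squared Mahalanobis distance.

    Assumes coalition values are computed by mean-replacing absent variables.
    """
    d = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve sigma @ w = d instead of forming the explicit inverse.
    w = np.linalg.solve(np.asarray(sigma, dtype=float), d)
    return d * w


def shapley_brute_force(x, mu, sigma):
    """Shapley values by enumerating all coalitions (exponential cost, check only)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    precision = np.linalg.inv(np.asarray(sigma, dtype=float))
    p = len(x)

    def value(subset):
        # Squared distance when only the variables in `subset` deviate from the mean.
        idx = np.array(subset, dtype=int)
        d = np.zeros(p)
        d[idx] = x[idx] - mu[idx]
        return d @ precision @ d

    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                phi[j] += weight * (value(subset + (j,)) - value(subset))
    return phi


# Illustrative (made-up) location, scatter matrix, and observation.
mu = np.array([0.0, 0.0, 0.0])
sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])
x = np.array([3.0, -1.0, 0.5])

phi = shapley_mahalanobis(x, mu, sigma)
md2 = (x - mu) @ np.linalg.solve(sigma, x - mu)
print("contributions:", phi)
print("sum of contributions:", phi.sum(), "squared distance:", md2)
```

The contributions sum to the squared Mahalanobis distance, and individual contributions can be negative when correlations make a coordinate less surprising given the others. In practice `mu` and `sigma` would presumably be robust estimates of location and scatter, which this sketch does not compute.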