Sparse, Efficient and Explainable Data Attribution with DualXDA

Data Attribution (DA) is an emerging approach in the field of eXplainable Artificial Intelligence (XAI) that aims to identify the influential training data points that determine model outputs. It seeks to provide transparency about the model and its individual predictions, e.g., for model debugging, where it helps identify data-related causes of suboptimal performance. However, existing DA approaches suffer from prohibitively high computational costs and memory demands when applied to even medium-scale datasets and models, forcing practitioners to resort to approximations that may fail to capture the true inference process of the underlying model. Additionally, current attribution methods exhibit low sparsity: they assign non-negligible attribution scores to a large number of training examples, which hinders the discovery of decisive patterns in the data. In this work, we introduce DualXDA, a framework for sparse, efficient and explainable DA comprising two interlinked approaches, Dual Data Attribution (DualDA) and eXplainable Data Attribution (XDA). With DualDA, we propose a novel approach for efficient and effective DA that leverages Support Vector Machine theory to provide fast and naturally sparse data attributions for AI predictions. In extensive quantitative analyses, we demonstrate that DualDA achieves high attribution quality and excels at a series of evaluated downstream tasks, while improving explanation time by a factor of up to 4,100,000x compared to the original Influence Functions method, and by up to 11,000x compared to that method's most efficient approximation in the literature to date. We further introduce XDA, a method that enhances Data Attribution with capabilities from feature attribution methods to explain why training samples are relevant for the prediction of a test sample in terms of impactful features, which we showcase and verify qualitatively in detail.
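The abstract attributes DualDA's sparsity to Support Vector Machine theory without giving details. The sketch below is not the DualDA implementation. It only illustrates the general SVM property such an approach can build on: in the dual form, a linear SVM's score for a test point is a sum of per-training-point terms, and the dual variable of every non-support vector is exactly zero. The toy data, the bias-free binary setting, and the names `train_dual_svm` and `attribute` are all assumptions made for illustration.

```python
import random

def train_dual_svm(X, y, C=1.0, epochs=200, seed=0):
    """Dual coordinate descent for a bias-free linear SVM with hinge loss.

    Returns the dual variables alpha (one per training point) and the
    primal weights w, maintained as w = sum_i alpha_i * y_i * x_i.
    Illustrative only; this is an assumption, not the paper's solver.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    w = [0.0] * d
    q = [sum(v * v for v in x) for x in X]  # squared norms (Gram diagonal)
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            if q[i] == 0.0:
                continue
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            grad = margin - 1.0  # gradient of the dual objective w.r.t. alpha_i
            new_alpha = min(max(alpha[i] - grad / q[i], 0.0), C)  # clip to [0, C]
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                for j in range(d):
                    w[j] += delta * y[i] * X[i][j]
                alpha[i] = new_alpha
    return alpha, w

def attribute(X, y, alpha, x_test):
    """Contribution of each training point to the score w . x_test."""
    return [a * yi * sum(p * t for p, t in zip(xi, x_test))
            for a, xi, yi in zip(alpha, X, y)]

# Hypothetical toy data: two linearly separable clusters in 2D.
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (4.0, 4.0), (3.0, 2.0),
     (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0), (-4.0, -4.0), (-3.0, -2.0)]
y = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]

alpha, w = train_dual_svm(X, y)
x_test = (1.0, 2.0)
scores = attribute(X, y, alpha, x_test)
model_output = sum(wj * xj for wj, xj in zip(w, x_test))
num_nonzero = sum(1 for a in alpha if a > 0.0)

print("attributions:", [round(s, 4) for s in scores])
print("training points with nonzero attribution:", num_nonzero, "of", len(X))
```

By construction, the per-point scores sum to the model's output for `x_test`, so the attribution is a complete decomposition of the prediction, and only the points on the margin receive a nonzero share.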

