With the increasing complexity of black-box machine learning models and their adoption in high-stakes domains, it is critical to explain their predictions. Local Interpretable Model-agnostic Explanations (LIME) is a widely used technique that explains the prediction of any classifier by fitting an interpretable model locally around the instance being explained. However, LIME assumes that the local decision boundary is linear, so it fails to capture non-linear relationships and can produce incorrect explanations. In this paper, we propose a novel method that generates high-fidelity explanations. We use multivariate adaptive regression splines (MARS) to model non-linear local boundaries, effectively capturing the underlying behavior of the reference model and thereby enhancing the local fidelity of the explanation. Additionally, we employ an N-ball sampling technique that draws samples directly from the desired local distribution instead of reweighting samples as LIME does, further improving faithfulness. We evaluate our method on three UCI datasets across different classifiers and a range of kernel widths. Experimental results show that our method yields more faithful explanations than the baselines, reducing root mean square error by 37% on average and significantly improving local fidelity.
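To make the two components concrete, the sketch below (a minimal illustration under stated assumptions, not the implementation evaluated in the paper) draws perturbations uniformly from an N-ball around the instance to be explained and fits a local surrogate to the black-box outputs. The dataset, radius, sample count, and helper names such as sample_n_ball are illustrative choices; the Earth estimator from the py-earth package stands in for MARS, with a ridge fallback to show the LIME-style linear surrogate it replaces.

```python
# Minimal sketch: N-ball sampling + local surrogate fitting.
# Assumptions: scikit-learn for the black-box classifier, py-earth for MARS;
# the dataset, radius=1.0, and n_samples=500 are illustrative, not the paper's setup.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

def sample_n_ball(center, radius, n_samples, rng):
    """Sample uniformly from a d-dimensional ball around `center`.

    Draw Gaussian directions, normalize them onto the unit sphere, then
    scale radii by u**(1/d) so points are uniform in volume rather than
    clustered near the center.
    """
    d = center.shape[0]
    directions = rng.standard_normal((n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    return center + radii * directions

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)          # one common UCI dataset
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                           # instance to explain
Z = sample_n_ball(x0, radius=1.0, n_samples=500, rng=rng)
p = black_box.predict_proba(Z)[:, 1]                # black-box outputs to imitate

# Fit the local surrogate on (Z, p). With py-earth installed this is a MARS
# model; otherwise a ridge model recovers a LIME-style linear surrogate.
try:
    from pyearth import Earth                       # assumption: py-earth available
    surrogate = Earth(max_degree=2).fit(Z, p)
except ImportError:
    from sklearn.linear_model import Ridge
    surrogate = Ridge().fit(Z, p)

# Local fidelity measured as RMSE between surrogate and black-box outputs.
rmse = np.sqrt(np.mean((surrogate.predict(Z) - p) ** 2))
print(f"local surrogate RMSE (fidelity): {rmse:.4f}")
```

The design point the sketch illustrates is that uniform N-ball sampling already yields perturbations distributed as desired around the instance, so no kernel reweighting of samples is needed, and the MARS surrogate can bend with the local decision boundary where a linear surrogate cannot.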