As complex machine learning models are increasingly used in sensitive applications like banking, trading or credit scoring, there is a growing demand for reliable explanation mechanisms. Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations. However, attribution methods typically assume a stationary environment in which the predictive model has been trained and remains stable. As a result, it is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications. In this paper, we discuss the impact of temporal change on local feature attributions. In particular, we show that local attributions can become obsolete each time the predictive model is updated or concept drift alters the data generating distribution. Consequently, local feature attributions in data streams provide high explanatory power only when combined with a mechanism that allows us to detect and respond to local changes over time. To this end, we present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift. CDLEEDS serves as an intuitive extension of attribution-based explanation techniques to identify outdated local attributions and enable more targeted recalculations. In experiments, we also show that the proposed framework can reliably detect both local and global concept drift. Accordingly, our work contributes to more meaningful and robust explainability in online machine learning.