Explainable artificial intelligence provides tools to better understand predictive models and their decisions, but many such methods are limited to producing insights with respect to a single class. When generating explanations for several classes, reasoning over them to obtain a comprehensive view may be difficult since they can present competing or contradictory evidence. To address this challenge we introduce the novel paradigm of multi-class explanations. We outline the theory behind such techniques and propose a local surrogate model based on multi-output regression trees -- called LIMEtree -- that offers faithful and consistent explanations of multiple classes for individual predictions while being post-hoc, model-agnostic and data-universal. On top of strong fidelity guarantees, our implementation delivers a range of diverse explanation types, including counterfactual statements favoured in the literature. We evaluate our algorithm with respect to explainability desiderata, through quantitative experiments and via a pilot user study, on image and tabular data classification tasks, comparing it to LIME, which is a state-of-the-art surrogate explainer. Our contributions demonstrate the benefits of multi-class explanations and wide-ranging advantages of our method across a diverse set of scenarios.
翻译:暂无翻译