Explaining the behavior of black-box machine learning models through human-interpretable rules is an important research area. Recent work has focused on explaining model behavior both locally, i.e., for specific predictions, and globally, across the fields of vision, natural language, reinforcement learning, and data science. We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data. Our approach builds on existing local model explanation methods to extract the conditions that are important for explaining model behavior on specific instances, and then applies an evolutionary algorithm that optimizes an information-theoretic fitness function to construct rules that explain global model behavior. We show that our approach outperforms existing approaches on a variety of datasets. Further, we introduce a parameter to evaluate the quality of an interpretation under distributional shift: it measures how well the interpretation predicts model behavior on previously unseen data distributions. We show that existing approaches for interpreting models globally lack distributional robustness. Finally, we show that the quality of the interpretation under distributional shift can be improved, and its robustness thereby increased, by adding out-of-distribution samples to the dataset used to learn the interpretation. All of the datasets used in our paper are open and publicly available. Our approach has been deployed in a leading digital marketing suite of products.
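To make the two-stage idea concrete, the following is a minimal sketch, not the method from the paper. The abstract does not specify the fitness function, so mutual information between rule coverage and the model's prediction is used here purely as an assumed stand-in. The candidate conditions are written by hand instead of being extracted by a local explanation method, and all names (`mutual_information`, `covers`, `fitness`, `evolve`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def covers(rule, row):
    """A rule is a set of (feature, value) conditions joined by AND."""
    return all(row[feature] == value for feature, value in rule)


def fitness(rule, rows, preds):
    """Assumed fitness: MI between rule coverage and the model's predictions."""
    return mutual_information([covers(rule, r) for r in rows], preds)


def evolve(conditions, rows, preds, generations=30, pop_size=8, seed=0):
    """Toy evolutionary search: mutate each rule by toggling one condition,
    then keep the fittest rules (ties broken by the sorted conditions)."""
    rng = random.Random(seed)
    population = [frozenset([c]) for c in conditions]
    for _ in range(generations):
        children = [rule ^ {rng.choice(conditions)} for rule in population]
        candidates = set(population) | {c for c in children if c}
        ranked = sorted(
            candidates,
            key=lambda r: (-fitness(r, rows, preds), sorted(r)),
        )
        population = ranked[:pop_size]
    return population[0]


# Hypothetical data: the "black-box model" predicts 1 only for large red items.
rows = [
    {"colour": c, "size": s}
    for c in ("red", "blue")
    for s in ("large", "small")
] * 5
preds = [int(r["colour"] == "red" and r["size"] == "large") for r in rows]

# Stand-in for conditions that a local explanation method would surface.
conditions = [
    ("colour", "red"), ("colour", "blue"),
    ("size", "large"), ("size", "small"),
]

best = evolve(conditions, rows, preds)
print(sorted(best), round(fitness(best, rows, preds), 4))
```

In this toy setting the search space is tiny, so the conjunction that matches the model's behavior is found quickly. With real data, the population size, mutation scheme, and fitness function would all need to be chosen with care.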