In this review, we examine the problem of designing interpretable and explainable machine learning models. Interpretability and explainability lie at the core of many machine learning and statistical applications in medicine, economics, law, and the natural sciences. Although interpretability and explainability have escaped a clear universal definition, many techniques motivated by these properties have been developed over the past 30 years, with the focus currently shifting towards deep learning methods. We emphasise the divide between interpretability and explainability and illustrate these two research directions with concrete examples of the state of the art. The review is intended for a general machine learning audience with an interest in exploring the problems of interpretation and explanation beyond logistic regression or random forest variable importance. This work is not an exhaustive literature survey, but rather a primer focusing selectively on certain lines of research which the authors found interesting or informative.