In recent years, the use of sophisticated statistical models to inform decisions in domains of high societal relevance has been on the rise. Although these models can often bring substantial improvements in accuracy and efficiency to organizations, many governments, institutions, and companies are reluctant to adopt them because their output is often difficult to explain in human-interpretable ways. Hence, these models are often regarded as black boxes, in the sense that their internal mechanisms can be opaque to human audit. In real-world applications, particularly in domains where decisions can have a sensitive impact (e.g., criminal justice, credit scoring, insurance risk, health risk assessment), model interpretability is desired. Recently, the academic literature has proposed a substantial number of methods for providing interpretable explanations of machine learning models. This survey reviews the most relevant and novel methods that form the state of the art for addressing the particular problem of explaining individual instances in machine learning. It seeks to provide a succinct review that can guide data science and machine learning practitioners in the search for methods appropriate to their problem domain.