Explanations are hypothesized to improve human understanding of machine learning models and to achieve a variety of desirable outcomes, ranging from model debugging to enhanced human decision making. However, empirical studies have found mixed and even negative results. An open question, therefore, is under what conditions explanations can improve human understanding, and in what way. Using adapted causal diagrams, we provide a formal characterization of the interplay between machine explanations and human understanding, and show that human intuitions play a central role in enabling human understanding. Specifically, we identify three core concepts of interest that cover all existing quantitative measures of understanding in the context of human-AI decision making: task decision boundary, model decision boundary, and model error. Our key result is that, without assumptions about task-specific intuitions, explanations may improve human understanding of the model decision boundary, but they cannot improve human understanding of the task decision boundary or of model error. To achieve complementary human-AI performance, we articulate possible ways in which explanations can work with human intuitions. For instance, human intuitions about the relevance of features (e.g., that education is more important than age in predicting a person's income) can be critical in detecting model error. We validate the importance of human intuitions in shaping the outcomes of machine explanations with empirical human-subject studies. Overall, our work provides a general framework along with actionable implications for future algorithmic development and empirical experiments on machine explanations.
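Below is a minimal, self-contained sketch of the kind of interplay the abstract describes; it is not from the paper, and all variable names and the synthetic data are hypothetical. It shows how a feature-importance explanation that reveals a model leaning on age can let a human who holds the intuition "education matters more than age for income" flag a likely model error, even without knowing the true task decision boundary.

```python
# Hypothetical illustration (not the paper's method): human intuition about
# feature relevance + a feature-importance explanation can surface model error.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
education = rng.integers(8, 21, n)   # years of schooling
age = rng.integers(18, 66, n)        # years

# Ground truth: high income depends on education, not age.
high_income = (education + rng.normal(0, 2, n) > 14).astype(int)

# A flawed training set where age spuriously correlates with the label
# (e.g., a sampling artifact) pushes the model to rely on age.
spurious_age = age.copy()
spurious_age[high_income == 1] += 15
X = np.column_stack([education, spurious_age])

model = LogisticRegression(max_iter=1000).fit(X, high_income)

# "Explanation": global feature importance approximated by coefficient
# magnitude rescaled by each feature's standard deviation.
importance = np.abs(model.coef_[0]) * X.std(axis=0)
for name, w in zip(["education", "age"], importance):
    print(f"{name}: {w:.2f}")

# If the explanation reports that age outweighs education, a human with the
# intuition "education > age for predicting income" can flag a likely model
# error -- a judgment the explanation alone cannot deliver.
```

In this sketch the explanation only exposes the model decision boundary (what the model attends to); it is the task-specific intuition about feature relevance that converts that exposure into a judgment about model error, consistent with the abstract's key result.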