Explaining artificial intelligence (AI) predictions is increasingly important, and even imperative, in many high-stakes applications where humans are the ultimate decision-makers. In this work, we propose two novel architectures of self-interpretable image classifiers that first explain, and then predict (as opposed to providing post-hoc explanations), by harnessing the visual correspondences between a query image and exemplars. Our models consistently improve (by 1 to 4 points) on out-of-distribution (OOD) datasets while performing marginally worse (by 1 to 2 points) on in-distribution tests than ResNet-50 and a $k$-nearest neighbor (kNN) classifier. Via a large-scale human study on ImageNet and CUB, our correspondence-based explanations are found to be more useful to users than kNN explanations. Our explanations help users more accurately reject the AI's wrong decisions than all other tested methods. Interestingly, for the first time, we show that it is possible to achieve complementary human-AI team accuracy (i.e., higher than either AI-alone or human-alone) on ImageNet and CUB image classification tasks.
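To make the explain-then-predict idea concrete, below is a minimal, hedged sketch of a correspondence-based classifier. It is not the paper's exact algorithm; the retrieval step, the patch-similarity measure, and the re-ranking by correspondence quality are illustrative assumptions, and all function and variable names (e.g., `explain_then_predict`, `exemplar_feats`) are hypothetical.

```python
import numpy as np

def explain_then_predict(query_feats, exemplar_feats, exemplar_labels, k=20):
    """Conceptual sketch: explain via visual correspondences, then predict.

    query_feats:     (P, D) patch embeddings of the query image
    exemplar_feats:  list of N arrays, each (P, D), one per training exemplar
    exemplar_labels: (N,) class label of each exemplar
    Returns the predicted label and, as the explanation, the patch-level
    correspondences to the retrieved exemplars.
    """
    # 1. Retrieve the k nearest exemplars by global (mean-pooled) similarity,
    #    as a plain kNN classifier would.
    q_global = query_feats.mean(axis=0)
    sims = np.array([q_global @ f.mean(axis=0) for f in exemplar_feats])
    nn_idx = np.argsort(-sims)[:k]

    # 2. Explain: for each retrieved exemplar, find the best-matching exemplar
    #    patch for every query patch (the visual correspondences).
    correspondences, corr_scores = [], []
    for i in nn_idx:
        patch_sim = query_feats @ exemplar_feats[i].T     # (P, P) patch similarities
        match = patch_sim.argmax(axis=1)                  # best exemplar patch per query patch
        correspondences.append(match)
        corr_scores.append(patch_sim.max(axis=1).mean())  # overall correspondence quality

    # 3. Predict: vote over the retrieved exemplars' labels, weighted by
    #    correspondence quality rather than by global similarity alone.
    scores = {}
    for i, s in zip(nn_idx, corr_scores):
        label = exemplar_labels[i]
        scores[label] = scores.get(label, 0.0) + s
    pred = max(scores, key=scores.get)
    return pred, list(zip(nn_idx.tolist(), correspondences))
```

The key design point the sketch tries to convey is ordering: correspondences are computed first and the prediction is derived from them, so the explanation shown to the user is the evidence the classifier actually used, rather than a post-hoc rationalization.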