While machine learning models are usually assumed to always output a prediction, there also exist extensions in the form of reject options, which allow the model to reject inputs for which only a prediction with unacceptably low certainty would be possible. With the ongoing rise of eXplainable AI, many methods for explaining model predictions have been developed. However, understanding why a given input was rejected, instead of being classified by the model, is also of interest. Surprisingly, explanations of rejects have not been considered so far. We propose to use counterfactual explanations for explaining rejects, and we investigate how to efficiently compute counterfactual explanations of different reject options for an important class of models, namely prototype-based classifiers such as learning vector quantization models.
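To make the setting concrete, here is a minimal, self-contained sketch, not the paper's method: an LVQ-style nearest-prototype classifier that rejects inputs whose relative-similarity certainty falls below a threshold, together with a naive random-search counterfactual for a rejected input. The class name `RejectingLVQ`, the threshold value, the specific certainty measure, and the brute-force search are all illustrative assumptions; the paper concerns computing such counterfactuals efficiently, which this search does not attempt.

```python
import numpy as np

class RejectingLVQ:
    """Nearest-prototype classifier with a certainty-based reject option."""

    def __init__(self, prototypes, labels, threshold=0.2):
        self.w = np.asarray(prototypes, dtype=float)  # one prototype per row
        self.c = np.asarray(labels)                   # class label per prototype
        self.threshold = threshold                    # reject below this certainty

    def certainty(self, x):
        # Relative similarity: normalized gap between the closest prototype
        # and the closest prototype of a different class (value in [0, 1]).
        d = np.linalg.norm(self.w - x, axis=1)
        i = int(np.argmin(d))
        d_plus = d[i]
        d_minus = d[self.c != self.c[i]].min()  # assumes >= 2 classes
        return (d_minus - d_plus) / (d_minus + d_plus)

    def predict(self, x):
        if self.certainty(x) < self.threshold:
            return None  # reject: any prediction would be too uncertain
        return self.c[int(np.argmin(np.linalg.norm(self.w - x, axis=1)))]

def counterfactual_of_reject(model, x, step=0.05, max_radius=2.0,
                             tries=200, seed=0):
    """Small perturbation (coarse random search) that turns a reject into
    an accepted input -- the perturbed point then explains the reject."""
    rng = np.random.default_rng(seed)
    for radius in np.arange(step, max_radius, step):
        for _ in range(tries):
            delta = rng.normal(size=x.shape)
            delta *= radius / np.linalg.norm(delta)  # sample on the sphere
            if model.predict(x + delta) is not None:
                return x + delta  # first accepted point at this radius
    return None  # no counterfactual found within max_radius

# Usage: a point halfway between two prototypes of different classes is
# rejected; the counterfactual shows how little it must move to be accepted.
model = RejectingLVQ(prototypes=[[0.0, 0.0], [1.0, 0.0]], labels=[0, 1])
x = np.array([0.5, 0.0])
print(model.predict(x))                    # None (rejected)
print(counterfactual_of_reject(model, x))  # nearby accepted point
```

The search above only illustrates the definition of a counterfactual of a reject (the closest accepted input); it is a stand-in for the efficient, reject-option-specific computation the paper investigates.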