Counterfactual instances offer human-interpretable insight into the local behaviour of machine learning models. We propose a general framework that uses a conditional generative model to generate sparse, in-distribution counterfactual explanations matching a desired target prediction, allowing batches of counterfactual instances to be generated with a single forward pass. The method is flexible with respect to both the type of generative model used and the task of the underlying predictive model. This allows straightforward application of the framework to different modalities such as images, time series or tabular data, to generative model paradigms such as GANs or autoencoders, and to predictive tasks like classification or regression. We illustrate the effectiveness of our method on image (CelebA), time series (ECG) and mixed-type tabular (Adult Census) data.
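The abstract describes generating counterfactuals with a conditional generative model in a single forward pass. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch and is not the paper's architecture: the `CounterfactualGenerator` class, the autoencoder-style conditioning, and all layer sizes are assumptions made for illustration only.

```python
# Minimal sketch (assumed architecture, not the paper's implementation): a
# conditional autoencoder-style generator that maps an input instance plus a
# one-hot target prediction to a counterfactual in a single forward pass.
import torch
import torch.nn as nn


class CounterfactualGenerator(nn.Module):
    def __init__(self, input_dim: int, num_classes: int, latent_dim: int = 32):
        super().__init__()
        # Encoder sees the original instance together with the desired target class.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim + num_classes, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder produces a counterfactual conditioned on the same target class.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 128),
            nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x: torch.Tensor, y_target: torch.Tensor) -> torch.Tensor:
        z = self.encoder(torch.cat([x, y_target], dim=-1))
        return self.decoder(torch.cat([z, y_target], dim=-1))


# Batches of counterfactuals in one forward pass: each row of `x` is paired
# with a desired (one-hot) target prediction in `y_target`.
generator = CounterfactualGenerator(input_dim=20, num_classes=3)
x = torch.randn(64, 20)                              # batch of instances
y_target = torch.eye(3)[torch.randint(0, 3, (64,))]  # one-hot target classes
x_cf = generator(x, y_target)                        # batch of counterfactuals
```

In practice such a generator would be trained with losses encouraging sparsity, proximity to the data distribution, and agreement of the predictive model's output on `x_cf` with `y_target`; those training details are beyond this sketch.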