In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available. Many attempts to provide such explanations revolve around pixel-based attributions or rely on previously known concepts. In this paper we aim to provide explanations by provably identifying \emph{high-level, previously unknown ground-truth concepts}. To this end, we propose a probabilistic modeling framework to derive (C)oncept (L)earning (a)nd (P)rediction (CLAP) -- a VAE-based classifier that uses visually interpretable concepts as predictors for a simple classifier. Assuming a generative model for the ground-truth concepts, we prove that CLAP is able to identify them while attaining optimal classification accuracy. Our experiments on synthetic datasets verify that CLAP identifies distinct ground-truth concepts, and it yields promising results on the medical Chest X-Ray dataset.
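To make the setting concrete, a minimal sketch of the kind of generative model alluded to above is given below; the particular factorization and the symbols $z_c$, $z_s$, and $g$ are our illustrative assumptions, not the paper's stated model:
\[
p(x, z_c, z_s \mid y) \;=\; p(x \mid z_c, z_s)\, p(z_c \mid y)\, p(z_s),
\qquad
\hat{y} \;=\; g(z_c),
\]
where $z_c$ collects the latent ground-truth concepts that feed a simple classifier $g$, while $z_s$ captures remaining variation in $x$ that is irrelevant to the label $y$.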