Post-hoc Concept Bottleneck Models

Concept Bottleneck Models (CBMs) map inputs onto a set of interpretable concepts ("the bottleneck") and use those concepts to make predictions. A concept bottleneck enhances interpretability because it can be inspected to understand which concepts the model "sees" in an input and which of these concepts it deems important. However, CBMs are restrictive in practice: they require dense concept annotations in the training data to learn the bottleneck. Moreover, CBMs often do not match the accuracy of an unrestricted neural network, which reduces the incentive to deploy them. In this work, we address these limitations by introducing Post-hoc Concept Bottleneck Models (PCBMs). We show that any neural network can be turned into a PCBM without sacrificing model performance while retaining the interpretability benefits. When concept annotations are not available for the training data, we show that PCBMs can transfer concepts from other datasets or from natural language descriptions of concepts via multimodal models. A key benefit of PCBMs is that they let users quickly debug and update the model to reduce spurious correlations and improve generalization to new distributions. PCBMs allow global model edits, which can be more efficient than previous work on local interventions that fix a specific prediction. Through a model-editing user study, we show that editing PCBMs via concept-level feedback can provide significant performance gains without using data from the target domain and without retraining the model.
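The pipeline described above (project a frozen network's embeddings onto concepts, predict from the concept scores, then edit the predictor globally) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, beyond what the abstract states, that each concept is a direction in embedding space and that the predictor is a plain linear model fitted by least squares. All names (`concept_bank`, `project_to_concepts`, `predict`) and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: embeddings come from a frozen backbone; here they are synthetic.
n_samples, embed_dim, n_concepts = 200, 16, 4
embeddings = rng.normal(size=(n_samples, embed_dim))

# Assumption: each concept is a unit-norm direction in embedding space.
concept_bank = rng.normal(size=(n_concepts, embed_dim))
concept_bank /= np.linalg.norm(concept_bank, axis=1, keepdims=True)


def project_to_concepts(x, bank):
    """Concept scores: inner product of each embedding with each concept direction."""
    return x @ bank.T


scores = project_to_concepts(embeddings, concept_bank)  # shape (n_samples, n_concepts)

# Synthetic binary labels that depend only on concepts 0 and 1.
labels = (scores[:, 0] + 0.5 * scores[:, 1] > 0).astype(float)

# Interpretable predictor: linear model on concept scores plus a bias term,
# fitted by least squares on +/-1 targets (a stand-in for any linear classifier).
design = np.hstack([scores, np.ones((n_samples, 1))])
weights, *_ = np.linalg.lstsq(design, 2 * labels - 1, rcond=None)


def predict(x, bank, w):
    """Predict a class from embeddings using only the concept scores."""
    s = project_to_concepts(x, bank)
    return (s @ w[:-1] + w[-1] > 0).astype(float)


accuracy = (predict(embeddings, concept_bank, weights) == labels).mean()

# Global edit: if concept 1 were judged spurious, zero its weight.
# This changes every future prediction, with no retraining and no new data.
edited_weights = weights.copy()
edited_weights[1] = 0.0
edited_predictions = predict(embeddings, concept_bank, edited_weights)
```

Because the predictor sees only concept scores, each entry of `weights` can be read as the importance assigned to one concept, and a single weight change acts on the whole model rather than on one prediction.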
