Constraining Representations Yields Models That Know What They Don't Know

A well-known failure mode of neural networks is that they may confidently return erroneous predictions. Such unsafe behaviour is particularly frequent when the use case differs slightly from the training context, and/or in the presence of an adversary. This work presents a novel direction to address these issues in a broad, general manner: imposing class-aware constraints on a model's internal activation patterns. Specifically, we assign to each class a unique, fixed, randomly generated binary vector, hereafter called its class code, and train the model so that its cross-depth activation patterns predict the class code matching the input sample's class. The resulting predictors are dubbed total activation classifiers (TAC). TACs may either be trained from scratch or used, at negligible cost, as a thin add-on on top of a frozen, pre-trained neural network. The distance between a TAC's activation pattern and the closest valid code acts as an additional confidence score, alongside that of the default, unTAC'ed prediction head. In the add-on case, the original neural network's inference head is completely unaffected (so its accuracy remains the same), but we now have the option to use TAC's own confidence and prediction when determining which course of action to take in a hypothetical production workflow. In particular, we show that TAC strictly improves the value derived from models allowed to reject/defer. We provide further empirical evidence that TAC works well on multiple types of architectures and data modalities, and that it is at least as good as state-of-the-art alternative confidence scores derived from existing models.
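The class-code idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the activation pattern is computed or which distance is used, so the sketch assumes the pattern is already available as a vector with one entry per code bit, and it assumes an L1 distance. The names `make_class_codes`, `tac_predict`, `predict_or_defer`, and `max_distance` are hypothetical.

```python
import numpy as np


def make_class_codes(num_classes, code_length, seed=0):
    """Draw one fixed random binary code per class, re-drawing until all are unique."""
    rng = np.random.default_rng(seed)
    while True:
        codes = rng.integers(0, 2, size=(num_classes, code_length))
        if len(np.unique(codes, axis=0)) == num_classes:
            return codes


def tac_predict(pattern, codes):
    """Return (nearest class, distance to its code) for one activation pattern.

    `pattern` stands in for the model's cross-depth activation pattern.
    The L1 distance is an assumption made for this sketch.
    """
    distances = np.abs(codes - pattern).sum(axis=1)
    nearest = int(np.argmin(distances))
    return nearest, float(distances[nearest])


def predict_or_defer(pattern, codes, max_distance):
    """Return the predicted class, or None to reject/defer when no code is close."""
    label, distance = tac_predict(pattern, codes)
    return label if distance <= max_distance else None


codes = make_class_codes(num_classes=10, code_length=64)

# A pattern that reproduces class 3's code exactly sits at distance 0 from it.
clean = codes[3].astype(float)

# A pattern halfway between 0 and 1 on every bit is equally far from every code.
ambiguous = np.full(64, 0.5)

print(predict_or_defer(clean, codes, max_distance=8.0))
print(predict_or_defer(ambiguous, codes, max_distance=8.0))
```

In this toy setup a small distance plays the role of high confidence. The threshold `max_distance` is an arbitrary illustrative value; in a reject/defer workflow it would have to be tuned on held-out data.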

