Similarity-Distance-Magnitude Activations

We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function, adding Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and enabling interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.
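
For orientation, the softmax baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of selective classification driven by output magnitude alone; it is not the SDM activation or the SDM estimator, whose definitions the abstract does not give. The function name `selective_predict` and the `threshold` value are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_predict(logits, threshold=0.9):
    """Return the argmax class index, or None to abstain.

    Abstains when the top softmax probability is below `threshold`,
    a hypothetical cutoff chosen only for illustration.
    """
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    return top if probs[top] >= threshold else None
```

A rule of this kind looks only at the magnitude of the output, which is the limitation the abstract says the Similarity and Distance signals are meant to address.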


Related content

Activation functions

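In an artificial neural network, the activation function of a node defines that node's output given an input or a set of inputs. A standard integrated circuit can be viewed as a digital network of activation functions that are either on (1) or off (0) depending on the input, which resembles the behavior of a linear perceptron. However, only nonlinear activation functions allow such a network to compute nontrivial problems using a small number of nodes; such activation functions are called nonlinearities.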
