We introduce stochastic activations, a novel strategy that randomly selects among several non-linear functions in the feed-forward layer of a large language model. In particular, we choose between SILU and RELU according to a Bernoulli draw. This strategy circumvents the optimization problem associated with RELU, namely, its constant output for negative inputs, which blocks gradient flow. We leverage this strategy in two ways: (1) we use stochastic activations during pre-training and fine-tune the model with RELU, which is then used at inference time to produce sparse latent vectors. This reduces inference FLOPs and translates into a significant speedup on CPU and GPU, and it yields better results than training from scratch with the RELU activation. (2) We evaluate stochastic activations for sequence generation. This strategy performs reasonably well: it yields higher diversity at only slightly lower performance than the best deterministic non-linearity, SILU, combined with temperature sampling. It therefore offers an alternative way to increase the diversity of generated text.
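A minimal sketch of the core idea, assuming a PyTorch-style feed-forward block. The class name StochasticFFN, the dimensions, the probability p_relu, and the per-forward-pass draw granularity are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticFFN(nn.Module):
    """Feed-forward block that picks SILU or RELU via a Bernoulli draw.

    Hypothetical sketch: one draw per forward pass during training;
    a deterministic activation is used once training is done.
    """

    def __init__(self, d_model: int, d_hidden: int, p_relu: float = 0.5):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.p_relu = p_relu  # probability of drawing RELU for this pass

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.fc1(x)
        if self.training and torch.rand(()) < self.p_relu:
            h = F.relu(h)   # sparse branch: zero output and zero gradient for negatives
        else:
            h = F.silu(h)   # smooth branch: keeps gradients flowing for negative inputs
        return self.fc2(h)

# Usage sketch: stochastic choice while pre-training, deterministic activation afterwards.
ffn = StochasticFFN(d_model=512, d_hidden=2048)
x = torch.randn(4, 16, 512)
ffn.train(); y_train = ffn(x)  # random SILU/RELU draw per forward pass
ffn.eval();  y_infer = ffn(x)  # deterministic SILU here; the paper fine-tunes with RELU
                               # so inference can exploit sparse latent vectors
```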