The choice of activation function is crucial for modern deep neural networks. Popular hand-designed activation functions such as the Rectified Linear Unit (ReLU) and its variants show promising performance across a wide range of tasks and models. Swish, an automatically discovered activation function, has been proposed and outperforms ReLU on many challenging datasets. However, it has two main drawbacks. First, the tree-based search space is highly discrete and restricted, which makes searching difficult. Second, the sample-based search method is inefficient, making it infeasible to find specialized activation functions for each dataset or neural architecture. To tackle these drawbacks, we propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method. It can learn specialized activation functions and achieves state-of-the-art (SOTA) performance on large-scale datasets such as ImageNet and COCO. For example, on the ImageNet classification dataset, PWLU improves top-1 accuracy over Swish by 0.9%/0.53%/1.0%/1.7%/1.0% for ResNet-18/ResNet-50/MobileNet-V2/MobileNet-V3/EfficientNet-B0. PWLU is also easy to implement and efficient at inference, so it can be widely applied in real-world applications.
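To give a concrete picture of the idea, below is a minimal PyTorch sketch of a learnable piecewise linear activation. It is only an illustration under stated assumptions, not the paper's exact formulation: the segment count, the bounded input region, the uniform breakpoint spacing, and the ReLU-like initialization are all assumptions made for this example.

import torch
import torch.nn as nn

class PWLUSketch(nn.Module):
    """Illustrative learnable piecewise linear activation (not the paper's exact method).

    Breakpoints are fixed and uniformly spaced in [-bound, bound];
    the output value at each breakpoint is a learnable parameter.
    Inputs outside the region follow the boundary segments linearly.
    """

    def __init__(self, num_segments: int = 16, bound: float = 3.0):
        super().__init__()
        self.num_segments = num_segments
        self.bound = bound
        xs = torch.linspace(-bound, bound, num_segments + 1)
        self.register_buffer("xs", xs)
        # ReLU-like initialization of the learnable breakpoint values (an assumption).
        self.ys = nn.Parameter(torch.relu(xs).clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        seg_len = 2 * self.bound / self.num_segments
        x_clamped = x.clamp(-self.bound, self.bound)
        # Index of the segment each input falls into.
        idx = ((x_clamped + self.bound) / seg_len).floor().long()
        idx = idx.clamp(max=self.num_segments - 1)
        x_left = self.xs[idx]
        y_left = self.ys[idx]
        slope = (self.ys[idx + 1] - y_left) / seg_len
        # Using x (not x_clamped) extrapolates the boundary segments
        # linearly outside [-bound, bound].
        return y_left + slope * (x - x_left)

# Usage: drop it in wherever a fixed activation would be used.
act = PWLUSketch()
out = act(torch.randn(8, 64))

Because the learnable parameters are just the per-breakpoint output values, such an activation can in principle adapt its shape per layer or per channel during training, which is the kind of specialization the abstract refers to.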