Activation functions (AFs), which are pivotal to the success (or failure) of a neural network, have received increased attention in recent years, with researchers seeking to design novel AFs that improve some aspect of network performance. In this paper we take another direction, wherein we combine a slew of known AFs into successful architectures, proposing three methods for doing so beneficially: 1) generate AF architectures at random, 2) use Optuna, an automatic hyper-parameter optimization software framework, with a Tree-structured Parzen Estimator (TPE) sampler, and 3) use Optuna with a Covariance Matrix Adaptation Evolution Strategy (CMA-ES) sampler. We show that all three methods often produce significantly better results on 25 classification problems when compared with a standard network composed of ReLU hidden units and a softmax output unit. Optuna with the TPE sampler emerged as the best AF architecture-producing method.
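The simplest of the three methods, random generation of AF architectures, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the pool of AFs, the layer sizes, and the per-layer assignment of one AF are all assumptions made here for concreteness; the output layer uses softmax, matching the baseline network described above.

```python
import numpy as np

# A small pool of known AFs to combine (the paper's exact AF pool
# is not reproduced here; these four are illustrative assumptions).
ACTIVATIONS = {
    "relu": lambda x: np.maximum(0.0, x),
    "tanh": np.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "elu": lambda x: np.where(x > 0, x, np.exp(x) - 1.0),
}

def random_af_architecture(n_hidden, rng):
    """Method 1: pick one known AF per hidden layer, uniformly at random."""
    names = list(ACTIVATIONS)
    return [names[rng.integers(len(names))] for _ in range(n_hidden)]

def forward(x, weights, af_names):
    """Forward pass through an MLP whose hidden-layer AFs follow
    af_names; the output layer is softmax, as in the baseline."""
    h = x
    for W, name in zip(weights[:-1], af_names):
        h = ACTIVATIONS[name](h @ W)
    logits = h @ weights[-1]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 3]  # input dim, two hidden layers, 3 classes
weights = [rng.standard_normal((a, b)) * 0.1
           for a, b in zip(sizes, sizes[1:])]
arch = random_af_architecture(n_hidden=2, rng=rng)
probs = forward(rng.standard_normal((5, 8)), weights, arch)
```

In methods 2 and 3, the same per-layer AF choice would instead be posed to Optuna as a categorical hyper-parameter (e.g. via `trial.suggest_categorical`), with the TPE or CMA-ES sampler guiding the search rather than uniform random draws.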