In the deep learning literature, there is considerable interest in designing activation functions that enhance neural network performance. In recent years, renewed attention has been given to activation functions whose parameters can be trained during the learning process, as they appear to improve network performance, in particular by reducing overfitting. In this paper, we propose a trainable activation function whose parameters need to be estimated. A fully Bayesian model is developed to estimate, automatically from the training data, both the model weights and the activation function parameters. An MCMC-based scheme is developed to perform the inference. The proposed method addresses overfitting and improves convergence time by using an efficient sampling scheme that guarantees convergence to the global maximum. The proposed scheme is tested on three datasets with three different CNNs. Promising results demonstrate the usefulness of our approach: model accuracy improves thanks to the trainable activation function and the Bayesian estimation of its parameters.
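The core idea, estimating an activation function's parameter by MCMC sampling rather than gradient descent, can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a Swish-like parametric activation `f(x) = x * sigmoid(beta * x)` and a random-walk Metropolis sampler over the single parameter `beta`, with the network weights held fixed; the paper's actual activation form, sampler, and joint weight/parameter inference are not detailed in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trainable activation (Swish-like): f(x) = x * sigmoid(beta * x).
# beta is the activation parameter to be estimated; this parametric form is
# an illustrative assumption, not the one proposed in the paper.
def act(x, beta):
    return x / (1.0 + np.exp(-beta * x))

# Tiny one-hidden-layer network with fixed random weights; only beta is sampled.
W1 = rng.normal(size=(1, 8))
W2 = rng.normal(size=(8, 1))

def predict(X, beta):
    return act(X @ W1, beta) @ W2

# Synthetic regression data generated with a "true" beta of 1.5.
X = rng.normal(size=(64, 1))
y = predict(X, 1.5) + 0.05 * rng.normal(size=(64, 1))

def log_post(beta):
    # Gaussian likelihood (noise std 0.05) plus a standard normal prior on beta.
    resid = y - predict(X, beta)
    return -0.5 * np.sum(resid**2) / 0.05**2 - 0.5 * beta**2

# Random-walk Metropolis over the activation parameter.
beta, lp = 0.0, log_post(0.0)
samples = []
for _ in range(2000):
    prop = beta + 0.1 * rng.normal()   # symmetric Gaussian proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        beta, lp = prop, lp_prop
    samples.append(beta)

est = np.mean(samples[1000:])  # posterior mean of beta after burn-in
```

Because the chain targets the full posterior over `beta`, the estimate accounts for uncertainty in the activation parameter rather than committing to a single point estimate, which is the kind of benefit the fully Bayesian treatment in the abstract is after.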