In the deep learning literature, there is considerable interest in designing activation functions that enhance neural network performance. In recent years, renewed attention has been given to activation functions whose parameters are trained along with the network, as they appear to improve performance, notably by reducing overfitting. In this paper, we propose a trainable activation function whose parameters must be estimated. A fully Bayesian model is developed to automatically estimate, from the training data, both the network weights and the activation-function parameters. An MCMC-based optimization scheme is developed to perform the inference; it addresses this joint estimation problem and reduces convergence time through an efficient sampling scheme that guarantees convergence to the global maximum. The proposed scheme is tested on three datasets with three different CNN architectures. Promising results demonstrate that the proposed activation function, together with the Bayesian estimation of its parameters, improves model accuracy.
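To make the idea concrete, the sketch below illustrates MCMC-based estimation of a single trainable activation parameter on a toy problem. This is a minimal illustration, not the paper's actual scheme: the PReLU-style parametric form, the Gaussian likelihood, the standard-normal prior, and the random-walk Metropolis sampler are all assumptions made for the example.

```python
import numpy as np

def prelu(x, a):
    # Assumed parametric activation: identity for x >= 0, learnable slope a for x < 0
    return np.where(x >= 0, x, a * x)

def log_posterior(a, x, y, sigma=0.1):
    # Gaussian likelihood on residuals plus a standard-normal prior on a (assumptions)
    resid = y - prelu(x, a)
    return -0.5 * np.sum(resid**2) / sigma**2 - 0.5 * a**2

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
a_true = 0.3
y = prelu(x, a_true) + 0.1 * rng.normal(size=200)

# Random-walk Metropolis over the single activation parameter a
a = 0.0
lp = log_posterior(a, x, y)
samples = []
for _ in range(5000):
    prop = a + 0.05 * rng.normal()          # symmetric Gaussian proposal
    lp_prop = log_posterior(prop, x, y)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis acceptance rule
        a, lp = prop, lp_prop
    samples.append(a)

post_mean = np.mean(samples[1000:])  # discard burn-in
print(f"posterior mean of a: {post_mean:.3f}")
```

The posterior mean recovers the slope used to generate the data; in a full network, the same accept/reject mechanism would be applied jointly to the weights and the activation parameters.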