Bayesian optimization for sparse neural networks with trainable activation functions

The deep neural network literature shows considerable interest in activation functions that improve network performance. In recent years, interest has returned to activation functions that are trained during the learning process, since they appear to improve performance, especially by reducing overfitting. In this paper, we propose a trainable activation function whose parameters must be estimated. A fully Bayesian model is developed to automatically estimate both the model weights and the activation function parameters from the training data. An MCMC-based optimization scheme is developed to carry out the inference. The proposed method aims to address these issues, notably overfitting, and to shorten convergence time by using an efficient sampling scheme that guarantees convergence to the global maximum. The scheme is tested on three datasets with three different CNNs. The promising results suggest that the proposed activation function, combined with Bayesian estimation of the parameters, improves model accuracy.
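To make the idea of jointly sampling a weight and an activation parameter concrete, here is a minimal sketch. The abstract does not give the form of the proposed activation function, the priors, or the sampler, so everything below is an assumption for illustration: a PReLU-style activation with a single negative-side slope `a`, one scalar weight `w`, Gaussian priors and likelihood, and a plain random-walk Metropolis sampler. It does not model sparsity and offers no global-convergence guarantee.

```python
import math
import random

def trainable_act(z, a):
    """PReLU-style activation (assumed form): identity for z >= 0, slope a for z < 0."""
    return z if z >= 0.0 else a * z

def log_posterior(w, a, xs, ys, noise_std=0.1, prior_std=10.0):
    """Unnormalised log posterior: Gaussian likelihood plus Gaussian priors on w and a."""
    log_lik = 0.0
    for x, y in zip(xs, ys):
        resid = y - trainable_act(w * x, a)
        log_lik -= 0.5 * (resid / noise_std) ** 2
    log_prior = -0.5 * (w / prior_std) ** 2 - 0.5 * (a / prior_std) ** 2
    return log_lik + log_prior

def metropolis(xs, ys, n_iter=4000, burn_in=1000, step=0.02, seed=0):
    """Random-walk Metropolis over (w, a); returns the posterior means after burn-in."""
    rng = random.Random(seed)
    w, a = 1.0, 1.0  # arbitrary starting point
    current = log_posterior(w, a, xs, ys)
    kept = []
    for i in range(n_iter):
        w_new = w + rng.gauss(0.0, step)
        a_new = a + rng.gauss(0.0, step)
        proposed = log_posterior(w_new, a_new, xs, ys)
        # Accept with probability min(1, exp(proposed - current)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            w, a, current = w_new, a_new, proposed
        if i >= burn_in:
            kept.append((w, a))
    w_mean = sum(s[0] for s in kept) / len(kept)
    a_mean = sum(s[1] for s in kept) / len(kept)
    return w_mean, a_mean

# Synthetic one-dimensional data generated from known (hypothetical) parameters.
data_rng = random.Random(1)
w_true, a_true = 1.5, 0.25
xs = [-2.0 + 4.0 * i / 39 for i in range(40)]
ys = [trainable_act(w_true * x, a_true) + data_rng.gauss(0.0, 0.1) for x in xs]

w_hat, a_hat = metropolis(xs, ys)
print(f"posterior means: w = {w_hat:.3f}, a = {a_hat:.3f}")
```

On this toy problem the posterior means should land near the generating values, which shows the weight and the activation parameter being estimated together from the same data rather than the slope being fixed in advance.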


Related content

Activation function


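In an artificial neural network, the activation function of a node defines that node's output given an input or a set of inputs. A standard integrated circuit can be viewed as a digital network of activation functions that are either on (1) or off (0), depending on the input. This is similar to the behavior of the linear perceptron in neural networks. However, only nonlinear activation functions allow such networks to compute nontrivial problems using a small number of nodes; such activation functions are called nonlinearities.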
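A small worked example may help show why the nonlinearity matters. The sketch below uses a hypothetical network with two hidden nodes and hand-picked weights (not taken from any source) and evaluates it once with ReLU and once with the identity as the activation function. With ReLU it computes XOR; with the identity the whole network collapses to a linear function of its inputs, and no linear function can represent XOR.

```python
def relu(z):
    """Nonlinear activation: passes positive values, clips negatives to zero."""
    return max(0.0, z)

def identity(z):
    """Linear activation: returns its input unchanged."""
    return z

def tiny_network(x1, x2, activation):
    """Two hidden nodes and one linear output node; weights are hand-picked."""
    h1 = activation(x1 + x2)        # hidden node 1: weights (1, 1), bias 0
    h2 = activation(x1 + x2 - 1.0)  # hidden node 2: weights (1, 1), bias -1
    return h1 - 2.0 * h2            # output node: weights (1, -2)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
relu_outputs = [tiny_network(a, b, relu) for a, b in inputs]
linear_outputs = [tiny_network(a, b, identity) for a, b in inputs]

print(relu_outputs)    # [0.0, 1.0, 1.0, 0.0], which matches XOR
print(linear_outputs)  # [2.0, 1.0, 1.0, 0.0], which does not
```

With the identity activation the output simplifies to 2 - (x1 + x2), so adding more hidden nodes of that kind would not help; the nonlinearity is what lets two nodes separate the XOR cases.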

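[Book] Optimization for Modern Data Analysis, 117-page PDF

Zhuanzhi (专知) member service

63+ reads · February 15, 2023

[University of Oxford PhD thesis] Geometric Optimisation on Manifolds with Applications to Deep Learning, 154-page PDF

Zhuanzhi (专知) member service

22+ reads · March 21, 2022

A comprehensive survey paper on activation functions in deep learning

Zhuanzhi (专知) member service

72+ reads · October 1, 2021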