In an artificial neural network, the activation function of a node defines that node's output given an input or a set of inputs. A standard integrated circuit can be viewed as a digital network of activation functions that are either "on" (1) or "off" (0) depending on the input; this resembles the behaviour of a linear perceptron in a neural network. However, only nonlinear activation functions allow such a network to compute nontrivial problems using a small number of nodes, and such functions are therefore called nonlinearities.
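Why nonlinearity matters can be shown in a few lines: without an activation function, stacking linear layers collapses into a single linear map, so depth adds nothing. A minimal sketch (the tiny weight matrices are illustrative, not from any source):

```python
import numpy as np

# Without an activation, two linear layers collapse into one linear map:
# W2 @ (W1 @ x) == (W2 @ W1) @ x for every input x.
W1 = np.array([[1.0], [-1.0]])   # 1 input -> 2 hidden units
W2 = np.array([[1.0, 1.0]])      # 2 hidden units -> 1 output
x = np.array([1.0])

linear_deep = W2 @ (W1 @ x)          # two layers -> [0.]
linear_collapsed = (W2 @ W1) @ x     # one collapsed layer -> [0.], same map

# A nonlinear activation between the layers breaks the collapse,
# letting depth add representational power.
relu = lambda z: np.maximum(z, 0.0)
nonlinear_deep = W2 @ relu(W1 @ x)   # relu([1, -1]) = [1, 0] -> [1.]

print(linear_deep, nonlinear_deep)   # [0.] [1.]
```

Here the linear network maps this input to 0 no matter how many layers are stacked, while a single ReLU in between already produces an output no one-layer linear map over the same weights would give.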


Introduction:

The exploding and vanishing gradient problem has long been a barrier to the effective training of neural networks. Although various tricks and techniques are used in practice to alleviate it, a satisfying theory or provable solution is still missing. In this paper, we address the problem from the perspective of high-dimensional probability theory. We give rigorous results showing that, under certain conditions, the exploding/vanishing gradient problem disappears with high probability if the network is sufficiently wide. Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through a new class of activation functions, namely Gaussian-Poincaré normalized functions, together with orthogonal weight matrices. Experiments on both synthetic and real data validate the theory and confirm its effectiveness on very deep neural networks in practical applications.
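One of the abstract's two ingredients, orthogonal weight matrices, can be illustrated in isolation: an orthogonal matrix preserves the Euclidean norm of the forward signal exactly, so a deep stack of such (purely linear) layers neither explodes nor vanishes. The sketch below shows only this single property; the Gaussian-Poincaré normalized activations are the paper's other ingredient and are not reproduced here. The width `n = 256` and `depth = 50` are arbitrary choices for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 256, 50

def orthogonal(n):
    # QR decomposition of a Gaussian matrix yields an orthogonal Q,
    # a standard way to sample an orthogonal weight matrix.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

x = rng.standard_normal(n)
h_orth = x.copy()
h_gauss = x.copy()
for _ in range(depth):
    # Orthogonal layer: ||Q @ h|| == ||h||, so the signal norm is preserved
    # exactly through all `depth` layers.
    h_orth = orthogonal(n) @ h_orth
    # Variance-scaled Gaussian layer: the norm is preserved only on average
    # and drifts randomly with depth.
    h_gauss = rng.standard_normal((n, n)) @ h_gauss / np.sqrt(n)

print(np.linalg.norm(h_orth) / np.linalg.norm(x))   # 1.0 (up to float error)
```

With a nonlinear activation in between, orthogonality alone no longer suffices, which is where the paper's normalized activation functions come in.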


Latest Papers

We investigate how the activation function can be used to describe neural firing in an abstract way, and in turn, why it works well in artificial neural networks. We discuss how a spike in a biological neurone belongs to a particular universality class of phase transitions in statistical physics. We then show that the artificial neurone is, mathematically, a mean field model of biological neural membrane dynamics, which arises from modelling spiking as a phase transition. This allows us to treat selective neural firing in an abstract way, and formalise the role of the activation function in perceptron learning. The resultant statistical physical model allows us to recover the expressions for some known activation functions as various special cases. Along with deriving this model and specifying the analogous neural case, we analyse the phase transition to understand the physics of neural network learning. Together, it is shown that there is not only a biological meaning, but a physical justification, for the emergence and performance of typical activation functions; implications for neural learning and inference are also discussed.
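The abstract says that known activation functions are recovered as special cases of the statistical-physical model, but gives no formulas. As a hedged illustration of the general idea (a standard statistical-mechanics identity, not the paper's actual derivation): the logistic sigmoid is exactly the Boltzmann occupation probability of the firing state in a two-state system, once the neurone's net input is identified with the negative of the energy gap times the inverse temperature:

```python
import numpy as np

def boltzmann_fire_prob(delta_e, beta=1.0):
    # Occupation probability of the "firing" state in a two-state system
    # with energy gap delta_e at inverse temperature beta:
    # p = exp(-beta*delta_e) / (1 + exp(-beta*delta_e)).
    return np.exp(-beta * delta_e) / (1.0 + np.exp(-beta * delta_e))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Identifying the net input x with -beta * delta_e recovers the logistic
# activation exactly: e^x / (1 + e^x) == 1 / (1 + e^-x).
xs = np.linspace(-5.0, 5.0, 11)
print(np.allclose(boltzmann_fire_prob(-xs), sigmoid(xs)))  # True
```

In this reading the inverse temperature plays the role of the sigmoid's slope (gain) parameter; in the zero-temperature limit the firing probability sharpens into a step function, which is the hard-threshold perceptron activation.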
