In artificial neural networks, the activation function of a node defines that node's output given an input or a set of inputs. A standard integrated circuit can be viewed as a digital network of activation functions that are either on (1) or off (0) depending on the input, which is analogous to the behavior of the linear perceptron in neural networks. However, only nonlinear activation functions allow such networks to compute nontrivial problems using a small number of nodes; such functions are therefore called nonlinearities.
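A minimal NumPy sketch (hypothetical, with hand-chosen weights) of that last point: a network with two hidden ReLU units can represent XOR, while the same architecture with a linear activation collapses to a single affine map and cannot.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# XOR inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_true = np.array([0, 1, 1, 0], dtype=float)

# Hand-set weights for a 2-2-1 network: two hidden ReLU units suffice.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])       # both hidden units sum the inputs
b1 = np.array([0.0, -1.0])        # second unit only fires when x1 + x2 > 1
w2 = np.array([1.0, -2.0])        # output = h1 - 2*h2

h = relu(X @ W1 + b1)
y_relu = h @ w2
print("ReLU network on XOR:  ", y_relu)     # [0. 1. 1. 0.]

# With the identity ("linear") activation, the same architecture collapses to
# a single affine map y = X @ (W1 @ w2) + b1 @ w2, which cannot fit XOR.
y_linear = (X @ W1 + b1) @ w2
print("Linear network on XOR:", y_linear)   # an affine function of x1 + x2
```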

Introduction:

The problems of exploding and vanishing gradients have long been an obstacle to the effective training of neural networks. Although various tricks and techniques are used in practice to mitigate them, a satisfactory theoretical or provable solution is still missing. In this paper, we address the problem from the perspective of high-dimensional probability theory. We give rigorous results showing that, under certain conditions, the exploding/vanishing gradient problem disappears with high probability if the network is sufficiently wide. Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through a new class of activation functions, namely Gaussian-Poincaré normalized functions, combined with orthogonal weight matrices. Experiments on data validate the theory and confirm its effectiveness on very deep neural networks in practical applications.
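The sketch below is a simplified illustration of one ingredient of that idea, not the paper's method: it omits the activation function (and hence the Gaussian-Poincaré normalization) and only shows, in NumPy, how orthogonal weight matrices keep the norm of the forward signal constant across many layers, whereas naive unscaled Gaussian weights blow it up.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth = 256, 100

def random_orthogonal(n):
    # QR decomposition of a Gaussian matrix yields an orthogonal Q.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

x = rng.standard_normal(width)
x_orth, x_naive = x.copy(), x.copy()

for _ in range(depth):
    x_orth = random_orthogonal(width) @ x_orth                # norm preserved exactly
    x_naive = rng.standard_normal((width, width)) @ x_naive   # norm grows ~sqrt(width) per layer

print("input norm:              ", np.linalg.norm(x))
print("orthogonal, 100 layers:  ", np.linalg.norm(x_orth))    # same as the input norm
print("naive Gaussian, 100 layers:", np.linalg.norm(x_naive)) # astronomically large
```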

Latest Content

The expressiveness of deep neural networks of bounded width has recently been investigated in a series of articles. The understanding about the minimum width needed to ensure universal approximation for different kinds of activation functions has progressively been extended (Park et al., 2020). In particular, it turned out that, with respect to approximation on general compact sets in the input space, a network width less than or equal to the input dimension excludes universal approximation. In this work, we focus on network functions of width less than or equal to the latter critical bound. We prove that in this regime the exact fit of partially constant functions on disjoint compact sets is still possible for ReLU network functions under some conditions on the mutual location of these components. Conversely, we conclude from a maximum principle that for all continuous and monotonic activation functions, universal approximation of arbitrary continuous functions is impossible on sets that coincide with the boundary of an open set plus an inner point of that set. We also show that some network functions of maximum width two (respectively, one) allow universal approximation on finite sets.
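As a concrete, hypothetical illustration of the exact-fit claim (a hand-built one-dimensional construction, not taken from the paper): with input dimension one and hidden width one, two ReLU layers suffice to represent a function that equals one constant on a compact set and another constant on a disjoint compact set lying entirely to its right.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Two disjoint compact sets in 1-D and the constant value to take on each.
K1 = np.linspace(-2.0, 0.0, 5)   # f should be -1 here
K2 = np.linspace(1.0, 3.0, 5)    # f should be  2 here
c1, c2 = -1.0, 2.0

def narrow_relu_net(x):
    # Hidden width 1, depth 2: an exact fit for this particular layout,
    # where K1 lies in (-inf, 0] and K2 lies in [1, +inf).
    h1 = relu(x)              # 0 on K1, >= 1 on K2
    h2 = relu(1.0 - h1)       # 1 on K1,  0 on K2
    return c2 - (c2 - c1) * h2

print(narrow_relu_net(K1))    # exactly [-1. -1. -1. -1. -1.]
print(narrow_relu_net(K2))    # exactly [ 2.  2.  2.  2.  2.]
```

The ordering of the two sets along the axis is the kind of condition on their mutual location that the abstract appears to refer to.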
