What makes an artificial neural network easier to train and more likely to produce desirable solutions than other comparable networks? In this paper, we provide a new angle to study such issues under the setting of a fixed number of model parameters, which in general is the dominant cost factor. We introduce a notion of variability and show that it correlates positively with the activation ratio and negatively with a phenomenon called Collapse to Constants (or C2C), which is closely related but not identical to the phenomenon commonly known as vanishing gradient. Experiments on a stylized model problem empirically verify that variability is indeed a key performance indicator for fully connected neural networks. The insights gained from this variability study will help the design of new and effective neural network architectures.