This paper introduces the notion of ``Initial Alignment'' (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target, as measured by the INAL, is needed in the architecture design. This also provides an answer to an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks, without explicit knowledge of the target function beyond its INAL.
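As a rough illustration of the quantity at play, the sketch below estimates the squared correlation between a Boolean target and the output of a randomly initialized network, averaged over several draws of the initialization. The specific target (a k-sparse parity), the one-hidden-layer ReLU architecture, and the Monte Carlo estimator are illustrative assumptions for this sketch, not the paper's exact definition of the INAL.

```python
import numpy as np

# Hedged sketch: estimate E_W[ E_x[f(x) * N_W(x)]^2 ], a natural notion of
# alignment between a Boolean target f and a network N_W at i.i.d. init W.
# All concrete choices below (target, architecture, sizes) are hypothetical.

rng = np.random.default_rng(0)

n = 20           # input dimension (hypothetical choice)
width = 256      # hidden-layer width (hypothetical choice)
n_samples = 20000
n_inits = 10

def target_parity(x, k=3):
    """Hypothetical target: parity of the first k coordinates, +/-1 valued."""
    return np.prod(x[:, :k], axis=1)

def init_network(rng):
    """One-hidden-layer ReLU net with normalized i.i.d. Gaussian weights."""
    W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, width))
    a = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)
    return W, a

def network(x, W, a):
    """Network output at initialization (no training involved)."""
    return np.maximum(x @ W, 0.0) @ a

# Uniform Boolean inputs in {-1, +1}^n.
x = rng.choice([-1.0, 1.0], size=(n_samples, n))
f = target_parity(x)

sq_corrs = []
for _ in range(n_inits):
    W, a = init_network(rng)
    out = network(x, W, a)
    sq_corrs.append(np.mean(f * out) ** 2)

print("estimated squared correlation at initialization:", np.mean(sq_corrs))
```

For a target such as a sparse parity, this estimate is typically negligible, which is the regime in which the paper's lower bound rules out learning by noisy gradient descent in polynomial time.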