The strong lottery ticket hypothesis has highlighted the potential for training deep neural networks by pruning, which has inspired interesting practical and theoretical insights into how neural networks can represent functions. For networks with ReLU activation functions, it has been proven that a target network of depth $L$ can be approximated by a subnetwork of a randomly initialized neural network that has twice the target's depth, $2L$, and is wider by a logarithmic factor. We show that a network of depth $L+1$ is sufficient. This result indicates that we can expect to find lottery tickets at realistic, commonly used depths while requiring only logarithmic overparametrization. Our novel construction approach applies to a large class of activation functions and is not limited to ReLUs.
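To make the idea of approximating a target network by pruning a randomly initialized one concrete, the sketch below approximates a single target weight by keeping only a subset of random candidate weights and zeroing out the rest. It is purely illustrative: the function name `prune_to_match`, the greedy selection heuristic, and all parameter values are assumptions of this example, not the construction analyzed in the paper.

```python
import numpy as np

def prune_to_match(target_weight, random_weights):
    """Greedily select a subset of random weights whose sum approximates
    the target weight; pruning corresponds to zeroing out the unselected
    entries. This is only an illustrative heuristic, not the paper's
    construction or its formal guarantee."""
    mask = np.zeros_like(random_weights, dtype=bool)
    residual = target_weight
    # Consider larger-magnitude candidates first and keep a candidate
    # only if it moves the running sum closer to the target.
    for i in np.argsort(-np.abs(random_weights)):
        if abs(residual - random_weights[i]) < abs(residual):
            mask[i] = True
            residual -= random_weights[i]
    return mask, target_weight - residual

rng = np.random.default_rng(0)
target = 0.73                                  # a weight of the target network
candidates = rng.uniform(-1.0, 1.0, size=50)   # randomly initialized weights
mask, approx = prune_to_match(target, candidates)
print(f"target={target:.4f}  approximation={approx:.4f}  kept {mask.sum()}/50 weights")
```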