The strong lottery ticket hypothesis holds the promise that pruning randomly initialized deep neural networks could offer a computationally efficient alternative to deep learning with stochastic gradient descent. Common parameter initialization schemes and existence proofs, however, are focused on networks with zero biases, thus foregoing the potential universal approximation property of pruning. To fill this gap, we extend multiple initialization schemes and existence proofs to nonzero biases, including explicit 'looks-linear' approaches for ReLU activation functions. These not only enable truly orthogonal parameter initialization but also reduce potential pruning errors. In experiments on standard benchmark data, we further highlight the practical benefits of nonzero bias initialization schemes, and present theoretically inspired extensions for state-of-the-art strong lottery ticket pruning.
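To make the 'looks-linear' idea mentioned above concrete, here is a minimal sketch (not the paper's exact construction, and with hypothetical layer dimensions) of the zero-bias baseline: mirroring each weight matrix with a sign flip makes a ReLU network compute an exactly linear map at initialization, since relu(a) - relu(-a) = a elementwise.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mirror_block(w):
    """Hidden-to-hidden 'looks-linear' mirroring: [[w, -w], [-w, w]]."""
    return np.block([[w, -w], [-w, w]])

rng = np.random.default_rng(0)
d_in, d_h1, d_h2, d_out = 4, 3, 3, 2   # hypothetical sizes for illustration
w1 = rng.standard_normal((d_h1, d_in))
w2 = rng.standard_normal((d_h2, d_h1))
w3 = rng.standard_normal((d_out, d_h2))

W1 = np.vstack([w1, -w1])   # input layer: duplicate rows with a sign flip
W2 = mirror_block(w2)       # hidden layer: mirrored block structure
W3 = np.hstack([w3, -w3])   # read-out: recombine the mirrored halves

x = rng.standard_normal(d_in)
y = W3 @ relu(W2 @ relu(W1 @ x))
# Despite two ReLU layers, the network computes the linear map w3 @ w2 @ w1:
assert np.allclose(y, w3 @ w2 @ w1 @ x)
```

Because the mirrored network is linear at initialization, the base matrices w1, w2, w3 can be chosen orthogonal, which is what the abstract refers to as truly orthogonal parameter initialization; the paper's contribution is to extend such schemes and the corresponding existence proofs to nonzero biases.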