Although sparse neural networks have been studied extensively, the focus has been primarily on accuracy. In this work, we focus instead on network structure and analyze three popular algorithms. We first measure performance when the structure persists but the weights are reset to a different random initialization, thereby extending the experiments of Deconstructing Lottery Tickets (Zhou et al., 2019). This experiment reveals that accuracy can be derived from structure alone. Second, to measure structural robustness, we investigate the sensitivity of sparse neural networks to further pruning after training, finding a stark contrast between algorithms. Finally, for a recent dynamic sparsity algorithm, we investigate how early in training the structure emerges. We find that even after one epoch the structure is mostly determined, allowing us to propose a more efficient algorithm that does not require dense gradients throughout training. By revisiting algorithms for sparse neural networks and analyzing their performance through a different lens, we uncover several interesting properties and promising directions for future research.
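To make the first experiment concrete, the following is a minimal sketch in PyTorch, not the paper's implementation: a trained network is pruned by weight magnitude, only the resulting binary mask (the structure) is kept, the weights are re-drawn from a fresh random initialization, and the mask is re-applied before retraining. The layer sizes, the 90% sparsity level, and the helper `magnitude_mask` are illustrative assumptions.

```python
# Minimal sketch of the structure-persistence experiment:
# keep the binary mask from a trained network, discard the trained
# weight values, and retrain from a different random initialization.
import torch
import torch.nn as nn

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude (1 - sparsity) fraction of weights."""
    k = int(weight.numel() * sparsity)  # number of weights to prune
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

# Illustrative architecture; the paper's exact models may differ.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))

# ... train `model` densely here, then extract the structure ...
masks = {name: magnitude_mask(p.data, sparsity=0.9)
         for name, p in model.named_parameters() if p.dim() > 1}

# Reset every weight to a *different* random initialization.
for layer in model:
    if isinstance(layer, nn.Linear):
        layer.reset_parameters()

# Re-apply the mask so only the structure, not the trained values, survives.
with torch.no_grad():
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])

# During retraining, pruned weights stay at zero by masking gradients
# after each backward pass, e.g.:
#   for name, p in model.named_parameters():
#       if name in masks: p.grad.mul_(masks[name])
```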