The Lottery Ticket Hypothesis (LTH) showed that sparse networks can be extracted by iteratively training a model, removing the connections with the lowest global weight magnitudes, and rewinding the remaining connections to their initial values. This global comparison discards distributional context between connections within a layer. Here we study means of recovering some of this layer-wise distributional context and generalise the LTH to consider weight importance values rather than global weight magnitudes. We find that, given a repeatable training procedure, applying different importance metrics yields distinct performant lottery tickets with little overlap in their connections. This strongly suggests that lottery tickets are not unique.
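The iterative train–prune–rewind procedure described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the `train_step` on a toy quadratic loss, the pruning fraction, and all function names are hypothetical stand-ins, and the global magnitude criterion in `iterative_magnitude_prune` is the point where an alternative importance metric could be substituted.

```python
import numpy as np

def train_step(w, mask, lr=0.1):
    # Hypothetical "training": one gradient step on a toy quadratic loss
    # L(w) = 0.5 * ||w - target||^2, applied only to surviving weights.
    target = np.linspace(-1.0, 1.0, w.size).reshape(w.shape)
    grad = w - target
    return (w - lr * grad) * mask

def iterative_magnitude_prune(w_init, rounds=3, prune_frac=0.2, steps=50):
    """Iteratively train, prune the globally smallest-magnitude surviving
    weights, then rewind the survivors to their initial values."""
    mask = np.ones_like(w_init)
    w = w_init.copy()
    for _ in range(rounds):
        for _ in range(steps):
            w = train_step(w, mask)
        # Global magnitude criterion: compare |w| across the whole network,
        # ignoring each layer's weight distribution.
        surviving = np.abs(w[mask == 1])
        threshold = np.quantile(surviving, prune_frac)
        mask = np.where((np.abs(w) <= threshold) & (mask == 1), 0.0, mask)
        # Rewind: reset surviving weights to initialization (the "ticket").
        w = w_init * mask
    return mask, w

rng = np.random.default_rng(0)
w0 = rng.normal(size=(4, 4))
mask, ticket = iterative_magnitude_prune(w0)
print(mask.sum() / mask.size)  # fraction of connections surviving
```

Swapping the `np.abs(w)` criterion for a different per-connection importance score, as studied here, would produce a different mask over the same initialization.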