The Lottery Ticket Hypothesis (LTH) states that a reasonably sized neural network contains a sub-network which, when trained from the same initialization, performs no worse than its dense counterpart. This work investigates the relationship between model size and the ease of finding such sparse sub-networks. Through experiments we show that, surprisingly, under a finite budget, smaller models benefit more from Ticket Search (TS).
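Ticket Search is commonly realized as iterative magnitude pruning with weight rewinding (train, prune the smallest-magnitude weights, rewind the survivors to their original initialization, repeat). The sketch below illustrates that loop only; the model, synthetic data, prune fraction, and round count are placeholder assumptions, not the setup used in this work.

```python
"""Minimal sketch of Ticket Search via iterative magnitude pruning.
All hyperparameters and data here are illustrative stand-ins."""
import copy
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def apply_masks(model, masks):
    # Keep pruned weights at exactly zero.
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

def train(model, masks, steps=200):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randn(64, 20)            # synthetic data stands in for the real task
        y = (x.sum(dim=1) > 0).long()      # learnable toy labels
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
        apply_masks(model, masks)          # re-zero pruned weights after each update

def prune_by_magnitude(model, masks, fraction=0.2):
    # Drop the smallest-magnitude surviving weights in each masked layer.
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name not in masks:
                continue
            alive = p[masks[name].bool()].abs()
            k = int(fraction * alive.numel())
            if k == 0:
                continue
            threshold = alive.kthvalue(k).values
            masks[name] = (p.abs() > threshold).float() * masks[name]

model = make_model()
init_state = copy.deepcopy(model.state_dict())           # theta_0, the "ticket" initialization
masks = {n: torch.ones_like(p)                            # prune weight matrices only
         for n, p in model.named_parameters() if p.dim() > 1}

for _round in range(3):                                   # Ticket Search rounds
    train(model, masks)
    prune_by_magnitude(model, masks, fraction=0.2)
    model.load_state_dict(init_state)                     # rewind survivors to theta_0
    apply_masks(model, masks)

train(model, masks)                                       # final training of the found ticket
```

Under a finite budget, each search round costs roughly one full training run, which is what makes the size-versus-search-cost trade-off studied here non-trivial.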