Neural architecture search (NAS) has made tremendous progress in the automatic design of effective neural network structures but suffers from a heavy computational burden. One-shot NAS significantly alleviates this burden through weight sharing and improves computational efficiency. Zero-shot NAS further reduces the cost by predicting the performance of a network from its initial state, without any training. Both methods aim to distinguish between "good" and "bad" architectures, i.e., to achieve high ranking consistency between predicted and true performance. In this paper, we propose Ranking Distillation one-shot NAS (RD-NAS) to enhance ranking consistency, which utilizes zero-cost proxies as a cheap teacher and adopts the margin ranking loss to distill the ranking knowledge. Specifically, we propose a margin subnet sampler to distill the ranking knowledge from zero-shot NAS to one-shot NAS by introducing group distance as the margin. Our evaluation on NAS-Bench-201 and a ResNet-based search space demonstrates that RD-NAS achieves 10.7\% and 9.65\% improvements in ranking ability, respectively. Our code is available at https://github.com/pprp/CVPR2022-NAS-competition-Track1-3th-solution
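To make the distillation idea concrete, below is a minimal sketch of a pairwise margin ranking loss in which zero-cost proxy scores act as the cheap teacher and their score gap serves as a stand-in for the paper's group-distance margin. The function name, the `margin_coeff` parameter, and the use of the raw proxy-score gap are illustrative assumptions, not the released implementation.

```python
import torch

def ranking_distillation_loss(acc_i, acc_j, zc_i, zc_j, margin_coeff=1.0):
    """Pairwise margin ranking loss distilled from a zero-cost proxy teacher.

    acc_i, acc_j : one-shot supernet scores of subnets i and j (1-D tensors)
    zc_i,  zc_j  : zero-cost proxy scores of the same subnets (the teacher)
    margin_coeff : illustrative scale turning the teacher gap into a margin
    """
    # Teacher preference: +1 if the proxy ranks subnet i above subnet j, else -1.
    target = torch.sign(zc_i - zc_j)
    # Per-pair margin grows with the teacher's score gap
    # (a simplified stand-in for the group distance used in the paper).
    margin = margin_coeff * (zc_i - zc_j).abs()
    # Standard margin ranking loss: max(0, -target * (acc_i - acc_j) + margin).
    return torch.clamp(-target * (acc_i - acc_j) + margin, min=0).mean()

# Example usage with dummy scores for a batch of sampled subnet pairs.
acc_i, acc_j = torch.rand(8), torch.rand(8)
zc_i, zc_j = torch.rand(8), torch.rand(8)
loss = ranking_distillation_loss(acc_i, acc_j, zc_i, zc_j)
```

The key design point is that the teacher only supplies relative order and a margin, so the supernet is pushed toward the proxy's ranking rather than its absolute score values.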