Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift

Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy using network weights detached from the supernet. However, the ranking correlation between the architectures' predicted accuracy and their actual capability is poor, which creates a dilemma for existing NAS methods. We attribute this ranking correlation problem to the supernet training consistency shift, which comprises feature shift and parameter shift. Feature shift refers to the dynamic input distribution of a hidden layer caused by random path sampling. These changing input distributions affect the loss descent and ultimately the architecture ranking. Parameter shift refers to contradictory parameter updates applied to a shared layer that lies in different paths at different training steps. Such rapidly changing parameters cannot preserve the architecture ranking. We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS. Specifically, we employ a supernet-Pi model that contains cross-path learning to reduce the feature consistency shift between different paths. Meanwhile, we adopt a novel nontrivial mean teacher containing negative samples to overcome parameter shift and model collision. Furthermore, Pi-NAS runs in an unsupervised manner, which allows it to search for more transferable architectures. Extensive experiments on ImageNet and a wide range of downstream tasks (e.g., COCO 2017, ADE20K, and Cityscapes) demonstrate the effectiveness and universality of Pi-NAS compared to supervised NAS. Code is available at https://github.com/Ernie1/Pi-NAS.
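
To make the two mechanisms named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation (see the linked repository for that). It assumes uniform single-path sampling and a plain exponential-moving-average (EMA) teacher; the function names `sample_path` and `ema_update`, the scalar stand-in weights, and the momentum value are all hypothetical. Cross-path learning and negative samples are not modeled here.

```python
import random


def sample_path(num_layers, num_choices, rng):
    """Pick one candidate operation index per layer (uniform sampling, assumed)."""
    return [rng.randrange(num_choices) for _ in range(num_layers)]


def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher style update: t <- momentum * t + (1 - momentum) * s."""
    return {
        key: momentum * teacher[key] + (1.0 - momentum) * student[key]
        for key in teacher
    }


rng = random.Random(0)
num_layers, num_choices = 3, 4

# Hypothetical scalar "weights", one per (layer, choice) pair.
# Every path that selects the same (layer, choice) shares that weight.
student = {(l, c): 0.0 for l in range(num_layers) for c in range(num_choices)}
teacher = dict(student)

for step in range(100):
    path = sample_path(num_layers, num_choices, rng)
    for layer, choice in enumerate(path):
        # Stand-in for a gradient step: only operations on the sampled path
        # are updated, so a shared weight is pulled by whichever paths use it.
        student[(layer, choice)] += rng.uniform(-1.0, 1.0)
    # The teacher tracks a smoothed copy of the student weights.
    teacher = ema_update(teacher, student)
```

In this toy setting the student weights jump whenever a path containing them is sampled, while the teacher weights move only a small fraction of the way toward the student at each step, which is the smoothing role a mean teacher is generally used for.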
