Accuracy Prediction with Non-neural Model for Neural Architecture Search

Neural architecture search (NAS) with an accuracy predictor, which predicts the accuracy of candidate architectures, has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural network-based predictors, which require delicate design and are prone to overfitting. Most architectures are represented as sequences of discrete symbols, which resemble tabular data, a format that non-neural predictors handle well. In this paper, we therefore study an alternative approach that uses a non-neural model for accuracy prediction. Specifically, as decision tree-based models can better handle tabular data, we leverage a gradient boosting decision tree (GBDT) as the predictor for NAS. We demonstrate that the GBDT predictor achieves prediction accuracy comparable to (if not better than) that of neural network-based predictors. Moreover, considering that a compact search space can ease the search process, we propose to prune the search space gradually according to important features derived from the GBDT. In this way, NAS can be performed by first pruning the search space and then searching for a neural architecture, which is more efficient and effective. Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as the predictor for NAS: (1) on NASBench-101, it is 22x, 8x, and 6x more sample efficient than random search, regularized evolution, and Monte Carlo Tree Search (MCTS), respectively, in finding the global optimum; (2) it achieves a 24.2% top-1 error rate on ImageNet, and a 23.4% top-1 error rate when enhanced with search space pruning. Code is provided in the supplementary materials.
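To make the idea concrete, here is a minimal sketch of a tree-based accuracy predictor over one-hot encoded architectures. It is an illustration under stated assumptions, not the authors' implementation. The operation vocabulary `OPS`, the four-position search space, and `synthetic_accuracy` are all hypothetical, and the accuracies are made-up demo values rather than measured results. A tiny gradient-boosting loop over depth-1 trees stands in for a full GBDT library so the example needs only the standard library. The final sign check on the most important feature is one plausible way to decide whether to keep or prune an operation; the abstract does not specify the actual pruning rule.

```python
import random

# Hypothetical search space: 4 positions, each choosing one of 3 operations.
OPS = ["conv1x1", "conv3x3", "maxpool3x3"]
NUM_POSITIONS = 4

def one_hot(arch):
    """Encode a sequence of op names as a flat binary (tabular) feature row."""
    return [1.0 if op == cand else 0.0 for op in arch for cand in OPS]

def feature_name(index):
    return f"pos{index // len(OPS)}={OPS[index % len(OPS)]}"

def fit_stump(X, residuals):
    """Pick the binary feature whose split most reduces squared error."""
    n, total = len(X), sum(residuals)
    best = None  # (gain, feature index, leaf value if 0, leaf value if 1)
    for j in range(len(X[0])):
        ones = [r for x, r in zip(X, residuals) if x[j] == 1.0]
        n1, n0 = len(ones), n - len(ones)
        if n1 == 0 or n0 == 0:
            continue  # feature is constant, so it cannot split the data
        sum1 = sum(ones)
        sum0 = total - sum1
        gain = sum1 ** 2 / n1 + sum0 ** 2 / n0 - total ** 2 / n
        if best is None or gain > best[0]:
            best = (gain, j, sum0 / n0, sum1 / n1)
    return best

def fit_gbdt(X, y, rounds=50, lr=0.3):
    """Gradient boosting with squared loss and depth-1 trees (stumps)."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, preds)]  # negative gradient
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, v0, v1 = stump
        stumps.append((j, lr * v0, lr * v1))
        importance[j] += gain  # gain-based feature importance
        preds = [p + (lr * v1 if x[j] == 1.0 else lr * v0)
                 for p, x in zip(preds, X)]
    return base, stumps, importance

def predict(model, arch):
    base, stumps, _ = model
    x = one_hot(arch)
    return base + sum(v1 if x[j] == 1.0 else v0 for j, v0, v1 in stumps)

def synthetic_accuracy(arch, rng):
    """Made-up ground truth for the demo; not a measured result."""
    acc = 0.90
    acc += 0.03 if arch[0] == "conv3x3" else 0.0
    acc -= 0.02 if arch[2] == "maxpool3x3" else 0.0
    return acc + rng.gauss(0.0, 0.002)

rng = random.Random(0)
archs = [[rng.choice(OPS) for _ in range(NUM_POSITIONS)] for _ in range(200)]
X = [one_hot(a) for a in archs]
y = [synthetic_accuracy(a, rng) for a in archs]
model = fit_gbdt(X, y)

importance = model[2]
top = max(range(len(importance)), key=importance.__getitem__)
top_feature = feature_name(top)
# Net effect of the top feature: positive suggests "keep", negative "prune".
effect = sum(v1 - v0 for j, v0, v1 in model[1] if j == top)
print("most important feature:", top_feature, "effect:", round(effect, 4))
```

In a real search, `predict` would rank unseen candidates so that only the most promising ones are trained and evaluated, and the importance scores would guide which operations to fix or remove before the next search round.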
