Neural architecture search (NAS) with an accuracy predictor, which predicts the accuracy of candidate architectures, has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural network-based predictors, which require delicate design and overfit easily. Most architectures are represented as sequences of discrete symbols, which resemble tabular data and are better suited to non-neural predictors; in this paper, we therefore study an alternative approach that uses a non-neural model for accuracy prediction. Specifically, since decision tree-based models handle tabular data well, we leverage a gradient boosting decision tree (GBDT) as the predictor for NAS. We demonstrate that the GBDT predictor achieves prediction accuracy comparable to (if not better than) that of neural network-based predictors. Moreover, since a compact search space eases the search process, we propose to prune the search space gradually according to important features derived from the GBDT. NAS can thus be performed by first pruning the search space and then searching for a neural architecture, which is more efficient and effective. Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as the predictor for NAS: (1) on NASBench-101, it is 22x, 8x, and 6x more sample-efficient than random search, regularized evolution, and Monte Carlo Tree Search (MCTS), respectively, in finding the global optimum; (2) it achieves a 24.2% top-1 error rate on ImageNet, and a 23.4% top-1 error rate when enhanced with search space pruning. Code is provided at https://github.com/renqianluo/GBDT-NAS.
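As a rough illustration of the two ideas above (a tree-based accuracy predictor over one-hot encoded architectures, followed by pruning guided by the fitted trees), the following minimal sketch may help. It is not the released implementation: it uses a tiny pure-Python ensemble of depth-1 boosted trees instead of a production GBDT library, and the toy search space, the synthetic `toy_accuracy` function, the `EFFECTS` table, and the signed-contribution pruning rule in `prune` are all assumptions made for illustration.

```python
import random

NUM_POSITIONS, NUM_OPS = 4, 3

# Hypothetical per-(position, op) effects on accuracy; op 2 at position 0 hurts most.
EFFECTS = [
    [0.020, 0.000, -0.050],
    [0.000, 0.010, 0.000],
    [0.005, 0.000, -0.005],
    [0.000, 0.000, 0.000],
]

def toy_accuracy(arch, rng):
    """Synthetic stand-in for training and evaluating an architecture."""
    return 0.90 + sum(EFFECTS[p][op] for p, op in enumerate(arch)) + rng.gauss(0, 0.002)

def one_hot(arch):
    """Flatten a list of op indices into 0/1 features; index = position * NUM_OPS + op."""
    return [1.0 if op == k else 0.0 for op in arch for k in range(NUM_OPS)]

def fit_stump(X, residuals):
    """Pick the binary feature whose split most reduces squared error on the residuals."""
    n, total = len(X), sum(residuals)
    best = None  # (gain, feature, value_if_0, value_if_1)
    for j in range(len(X[0])):
        n1 = sum(1 for x in X if x[j] > 0.5)
        n0 = n - n1
        if n0 == 0 or n1 == 0:
            continue
        sum1 = sum(r for x, r in zip(X, residuals) if x[j] > 0.5)
        sum0 = total - sum1
        gain = sum0 ** 2 / n0 + sum1 ** 2 / n1 - total ** 2 / n
        if best is None or gain > best[0]:
            best = (gain, j, sum0 / n0, sum1 / n1)
    return best

def fit_gbdt(X, y, n_trees=100, lr=0.1):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        _, j, v0, v1 = stump
        stumps.append((j, lr * v0, lr * v1))
        preds = [p + lr * (v1 if x[j] > 0.5 else v0) for x, p in zip(X, preds)]
    return base, stumps

def predict(model, x):
    base, stumps = model
    return base + sum(v1 if x[j] > 0.5 else v0 for j, v0, v1 in stumps)

def prune(space, model):
    """Drop the (position, op) whose presence lowers predicted accuracy the most.

    Assumption: the summed (v1 - v0) of a feature's stumps is used as its signed effect.
    """
    effect = [0.0] * (NUM_POSITIONS * NUM_OPS)
    for j, v0, v1 in model[1]:
        effect[j] += v1 - v0
    worst = min(range(len(effect)), key=effect.__getitem__)
    pos, op = divmod(worst, NUM_OPS)
    pruned = [list(ops) for ops in space]
    if effect[worst] < 0 and op in pruned[pos] and len(pruned[pos]) > 1:
        pruned[pos].remove(op)
    return pruned

# Sample architectures, "evaluate" them, fit the predictor, then prune once.
rng = random.Random(0)
archs = [[rng.randrange(NUM_OPS) for _ in range(NUM_POSITIONS)] for _ in range(200)]
X = [one_hot(a) for a in archs]
y = [toy_accuracy(a, rng) for a in archs]
model = fit_gbdt(X, y)
space = [list(range(NUM_OPS)) for _ in range(NUM_POSITIONS)]
pruned = prune(space, model)
print(pruned)
```

In a fuller loop, new candidates would be sampled only from `pruned`, ranked with `predict`, and the top-ranked ones evaluated and added to the training set before the next round of fitting and pruning.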