Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs). However, the best way to use data when searching for network architectures remains unclear and is still being explored. Previous work has analyzed whether ground-truth labels are necessary in NAS and has inspired broad interest. In this work, we take a further step and ask whether real data is necessary for NAS to be effective. The answer to this question is important for applications with a limited amount of accessible data, and it can help people improve NAS by leveraging the extra flexibility of data generation. To explore whether NAS needs real data, we construct three types of unreal datasets using: 1) randomly labeled real images; 2) generated images and labels; and 3) generated Gaussian noise with random labels. These datasets facilitate analysis of the generalization and expressivity of the searched architectures. We study the performance of architectures searched on these constructed datasets using popular differentiable NAS methods. Extensive experiments on CIFAR, ImageNet and CheXpert show that the searched architectures can achieve promising results compared with those derived from the conventional NAS pipeline with real labeled data, suggesting the feasibility of performing NAS with unreal data.
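As a rough illustration of the dataset constructions listed in the abstract, the following minimal sketch builds two of the three unreal dataset types with NumPy: randomly labeled real images (type 1) and Gaussian noise with random labels (type 3). The function names, array shapes, class count, and the zero-filled stand-in for real images are all assumptions made for illustration and are not taken from the paper. Type 2 (generated images and labels) is omitted because the abstract does not say how the images are generated.

```python
import numpy as np


def randomly_label(num_images, num_classes, rng):
    """Type 1: keep the real images, but draw each label uniformly at random."""
    # Hypothetical helper: the labels carry no information about the images.
    return rng.integers(0, num_classes, size=num_images)


def gaussian_noise_dataset(num_images, image_shape, num_classes, rng):
    """Type 3: images of standard Gaussian noise paired with random labels."""
    images = rng.standard_normal(size=(num_images, *image_shape)).astype(np.float32)
    labels = rng.integers(0, num_classes, size=num_images)
    return images, labels


rng = np.random.default_rng(0)

# Stand-in for a batch of real images (assumed CIFAR-like shape: 32x32 RGB).
real_images = np.zeros((8, 32, 32, 3), dtype=np.float32)

# Type 1: real images, random labels over an assumed 10 classes.
random_labels = randomly_label(len(real_images), 10, rng)

# Type 3: Gaussian-noise images with random labels.
noise_images, noise_labels = gaussian_noise_dataset(8, (32, 32, 3), 10, rng)
```

Either pair of arrays could then be fed to a NAS search loop in place of the usual labeled training set. How the search itself consumes them depends on the NAS method and is not shown here.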