Today deep learning is widely used for building software. A software engineering problem with deep learning is that finding an appropriate convolutional neural network (CNN) model for a task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras, aims to solve this problem by essentially viewing it as a search problem: the starting point is a default CNN model, and mutations of this model allow exploration of the space of CNN models to find one that works best for the problem. These works have had significant success in producing high-accuracy CNN models. There are two problems, however. First, NAS can be very costly, often taking several hours to complete. Second, the CNN models produced by NAS can be so complex that they are harder to understand and costlier to train. We propose a novel approach for NAS in which, instead of starting from a default CNN model, the initial model is selected from a repository of models mined from GitHub. The intuition is that developers solving a similar problem may have developed a better starting point than the default model. We also analyze common layer patterns of CNN models in the wild to understand the changes that developers make to improve their models, and our approach uses these commonly occurring changes as mutation operators in NAS. We have implemented our approach, called Manas, by extending Auto-Keras. Our evaluation using eight top-voted problems from Kaggle, covering tasks including image classification and image regression, shows that given the same search time and without loss of accuracy, Manas produces models with 42.9% to 99.6% fewer parameters than Auto-Keras' models. Benchmarked on GPU, Manas' models train 30.3% to 641.6% faster than Auto-Keras' models.
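The search strategy described above can be sketched as a simple mutation-based loop. This is a minimal illustration, not the actual Manas or Auto-Keras implementation; `seed_models`, `mutations`, and `evaluate` are hypothetical placeholders for the mined GitHub models, the commonly occurring change operators, and a model-scoring function.

```python
import random

def mutate(model, mutations):
    """Apply one commonly occurring change (mutation operator) to a model."""
    op = random.choice(mutations)
    return op(model)

def search(seed_models, mutations, evaluate, budget):
    """Greedy mutation-based NAS loop seeded with mined models.

    Instead of a default architecture, start from the best model mined
    from a repository, then explore the search space by mutation.
    """
    best = max(seed_models, key=evaluate)
    best_score = evaluate(best)
    for _ in range(budget):
        candidate = mutate(best, mutations)
        score = evaluate(candidate)
        if score > best_score:  # keep only improving mutations
            best, best_score = candidate, score
    return best
```

In the real setting, `evaluate` would train the candidate CNN and return its validation accuracy, which is why the time budget matters: a better seed lets the same budget reach a good model sooner.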