Improving the sample-efficiency of neural architecture search with reinforcement learning

Designing complex architectures has been an essential driver of the revolution that deep learning has brought about in the past decade. When solving difficult problems in a data-driven manner, a well-tried approach is to take an architecture discovered by renowned deep learning scientists (e.g. Inception) as a basis and apply it to the specific problem. This might be sufficient, but as of now, achieving very high accuracy on a complex or yet unsolved task requires the knowledge of highly trained deep learning experts. In this work, we would like to contribute to the area of Automated Machine Learning (AutoML), specifically Neural Architecture Search (NAS), which intends to make deep learning methods accessible to a wider audience by designing neural topologies automatically. Although several different approaches exist (e.g. gradient-based or evolutionary algorithms), our focus is on one of the most promising research directions, reinforcement learning (RL). In this scenario, a recurrent neural network (the controller) is trained to create problem-specific neural network architectures (the child networks). The validation accuracies of the child networks serve as the reward signal for training the controller with reinforcement learning. The basis of our proposed work is Efficient Neural Architecture Search (ENAS), where parameters are shared among the child networks. ENAS, like many other RL-based algorithms, emphasizes the training of the child networks, since speeding up their convergence yields a denser reward signal for the controller and therefore significantly reduces training time. The controller was originally trained with REINFORCE. In our research, we propose to replace this with a more modern and complex algorithm, PPO, which has been shown to be faster and more stable in other environments. We then briefly discuss and evaluate our results.
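To make the difference between the two controller objectives concrete, the sketch below contrasts a REINFORCE-style loss with the PPO clipped surrogate loss in their standard textbook forms. It is only an illustration under stated assumptions and not the implementation used in this work: the function names, the scalar `baseline`, and the default `clip_eps=0.2` are hypothetical choices, and the "reward" is taken to be the validation accuracy of a sampled child network.

```python
import math


def reinforce_loss(log_probs, rewards, baseline):
    """REINFORCE surrogate loss: -mean((R - b) * log pi(a)).

    log_probs: log-probability of each sampled architecture under the controller.
    rewards:   reward of each sample (assumed here: child validation accuracy).
    baseline:  scalar baseline subtracted to reduce variance (an assumption).
    """
    advantages = [r - baseline for r in rewards]
    total = sum(adv * lp for adv, lp in zip(advantages, log_probs))
    return -total / len(log_probs)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate loss: -mean(min(r * A, clip(r, 1-eps, 1+eps) * A)).

    r is the probability ratio between the current controller policy and the
    policy that sampled the architectures; clip_eps=0.2 is a hypothetical default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes the incentive to move the policy
        # far away from the one that generated the samples.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

In this form, the clipping is what limits the size of each policy update: once the ratio leaves the interval [1 - clip_eps, 1 + clip_eps] in the direction favored by the advantage, the sample no longer contributes a gradient, which is the usual explanation for the stability attributed to PPO.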
