In this paper, we propose multi-agent automated machine learning (MA2ML) to effectively handle the joint optimization of modules in automated machine learning (AutoML). MA2ML treats each machine learning module, such as data augmentation (AUG), neural architecture search (NAS), or hyper-parameter optimization (HPO), as an agent and the final performance as the reward, thereby formulating a multi-agent reinforcement learning problem. MA2ML explicitly assigns credit to each agent according to its marginal contribution, which enhances cooperation among modules, and incorporates off-policy learning to improve search efficiency. Theoretically, MA2ML guarantees monotonic improvement of the joint optimization. Extensive experiments show that MA2ML achieves state-of-the-art top-1 accuracy on ImageNet under computational cost constraints, e.g., $79.7\%/80.5\%$ with fewer than 600M/800M FLOPs. Ablation studies verify the benefits of credit assignment and off-policy learning in MA2ML.
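To make the idea of credit by marginal contribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a counterfactual form of credit: the reward of the chosen joint configuration minus the expected reward when one agent's choice is resampled from its own policy while the other agents' choices stay fixed. The function `marginal_credit`, the reward surrogate `toy_reward`, the binary action spaces, and the uniform policies are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

JointAction = Dict[str, int]


def marginal_credit(
    reward_fn: Callable[[JointAction], float],
    policies: Dict[str, List[float]],
    joint_action: JointAction,
) -> Dict[str, float]:
    """Assumed counterfactual credit for each agent.

    credit_i = R(a) - sum over a_i' of pi_i(a_i') * R(a_i', a_{-i})
    """
    full_reward = reward_fn(joint_action)
    credits: Dict[str, float] = {}
    for agent, probs in policies.items():
        baseline = 0.0
        for alt_action, prob in enumerate(probs):
            # Swap only this agent's choice; keep the others fixed.
            counterfactual = dict(joint_action)
            counterfactual[agent] = alt_action
            baseline += prob * reward_fn(counterfactual)
        credits[agent] = full_reward - baseline
    return credits


def toy_reward(a: JointAction) -> float:
    # Hypothetical accuracy surrogate: the NAS choice helps on its own,
    # while AUG and HPO only help when both are switched on together.
    return 0.5 + 0.2 * a["NAS"] + 0.1 * a["AUG"] * a["HPO"]


# Hypothetical uniform policies over two options per module.
policies = {"AUG": [0.5, 0.5], "NAS": [0.5, 0.5], "HPO": [0.5, 0.5]}
joint = {"AUG": 1, "NAS": 1, "HPO": 1}

credits = marginal_credit(toy_reward, policies, joint)
for agent, credit in credits.items():
    print(f"{agent}: {credit:+.3f}")
```

In this toy setting the NAS agent receives the largest credit because its choice changes the reward most when the other modules are held fixed. In a policy-gradient search, such per-agent credits could stand in for the shared final reward when updating each agent's policy.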