Automated algorithm configuration relieves users of tedious, trial-and-error tuning. A popular paradigm for algorithm configuration is dynamic algorithm configuration (DAC), in which an agent learns dynamic configuration policies across instances by reinforcement learning (RL). However, many complex algorithms have several different types of configuration hyperparameters, and such heterogeneity can be difficult to handle for classic DAC, which uses a single-agent RL policy. In this paper, we address this issue and propose multi-agent DAC (MA-DAC), with one agent working for one type of configuration hyperparameter. MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it with a cooperative multi-agent RL (MARL) algorithm. As an instantiation, we apply MA-DAC to a well-known optimization algorithm for multi-objective optimization problems. Experimental results show that MA-DAC not only achieves superior performance compared with other configuration tuning approaches based on heuristic rules, multi-armed bandits, and single-agent RL, but also generalizes to different problem classes. Furthermore, we release the environments in this paper as a benchmark for testing MARL algorithms, with the hope of facilitating the application of MARL.
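To make the formulation above concrete, here is a minimal Python sketch of a cooperative multi-agent configuration environment. It is a toy under stated assumptions, not the environments released with the paper: the target algorithm (momentum gradient descent on a one-dimensional quadratic), the class `ToyMADACEnv`, the `ACTION_SPACES` table, and the `rollout` helper are all hypothetical names chosen for illustration. The curvature plays the role of the instance context, each agent controls one hyperparameter type, and all agents receive the same team reward.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical action sets: one entry per hyperparameter type, one agent each.
ACTION_SPACES: Dict[str, List[float]] = {
    "step_size": [0.01, 0.1, 0.5],
    "momentum": [0.0, 0.5, 0.9],
}

Observation = Tuple[float, float, int]


@dataclass
class ToyMADACEnv:
    """Toy cooperative environment (an assumption for illustration):
    agents jointly configure momentum gradient descent on
    f(x) = 0.5 * curvature * x**2."""

    curvature: float  # context: identifies the problem instance
    horizon: int = 20
    x: float = field(default=1.0, init=False)
    velocity: float = field(default=0.0, init=False)
    t: int = field(default=0, init=False)

    def loss(self) -> float:
        return 0.5 * self.curvature * self.x ** 2

    def observe(self) -> Observation:
        # All agents share the same observation in this sketch.
        return (self.x, self.velocity, self.t)

    def reset(self) -> Observation:
        self.x, self.velocity, self.t = 1.0, 0.0, 0
        return self.observe()

    def step(self, joint_action: Dict[str, int]) -> Tuple[Observation, float, bool]:
        # Each agent picks an index into its own hyperparameter's action set.
        lr = ACTION_SPACES["step_size"][joint_action["step_size"]]
        beta = ACTION_SPACES["momentum"][joint_action["momentum"]]
        before = self.loss()
        grad = self.curvature * self.x
        self.velocity = beta * self.velocity - lr * grad
        self.x += self.velocity
        self.t += 1
        reward = before - self.loss()  # shared team reward: loss decrease
        return self.observe(), reward, self.t >= self.horizon


def rollout(env: ToyMADACEnv,
            policies: Dict[str, Callable[[Observation], int]]) -> float:
    """Run one episode with one policy per hyperparameter type."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        joint_action = {name: policy(obs) for name, policy in policies.items()}
        obs, reward, done = env.step(joint_action)
        total += reward
    return total


if __name__ == "__main__":
    env = ToyMADACEnv(curvature=2.0, horizon=5)
    # Placeholder constant policies; a MARL algorithm would learn these.
    fixed = {"step_size": lambda obs: 1, "momentum": lambda obs: 1}
    print(rollout(env, fixed))
```

The constant policies stand in for learned ones; in a cooperative MARL setup they would be trained to maximize the shared return across instances with different contexts.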