Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of multiple -- often conflicting -- performance criteria, necessitating the adoption of a multi-objective (MO) perspective. While the literature on MO optimization is rich, few prior studies have focused on HPO. In this paper, we propose algorithms that extend asynchronous successive halving (ASHA) to the MO setting. Considering multiple evaluation metrics, we assess the performance of these methods on three real-world tasks: (i) neural architecture search, (ii) algorithmic fairness and (iii) language model optimization. Our empirical analysis shows that MO ASHA makes it possible to perform MO HPO at scale. Further, we observe that taking the entire Pareto front into account for candidate selection consistently outperforms multi-fidelity HPO based on MO scalarization in terms of wall-clock time. Our algorithms (to be open-sourced) establish new baselines for future research in the area.
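To make the contrast between the two candidate-selection strategies concrete, the sketch below is our own minimal illustration (not the paper's released code): it compares promoting configurations by peeling off Pareto fronts against ranking them by a fixed linear scalarization. The function names `select_pareto` and `select_scalarized`, the toy objectives, and the promotion budget `k` are assumptions made purely for illustration.

```python
import numpy as np

def pareto_front(costs):
    """Boolean mask of non-dominated points (all objectives minimized).

    costs: array of shape (n_candidates, n_objectives).
    """
    n = costs.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Point i is dominated if some other point is <= in every objective
        # and strictly < in at least one.
        dominated_by = np.all(costs <= costs[i], axis=1) & np.any(costs < costs[i], axis=1)
        if dominated_by.any():
            mask[i] = False
    return mask

def select_pareto(costs, k):
    """Pick k candidates by repeatedly peeling off Pareto fronts."""
    remaining = np.arange(costs.shape[0])
    selected = []
    while len(selected) < k and remaining.size > 0:
        front = remaining[pareto_front(costs[remaining])]
        selected.extend(front.tolist())
        remaining = np.setdiff1d(remaining, front)
    return selected[:k]

def select_scalarized(costs, k, weights=None):
    """Baseline: rank candidates by a fixed linear scalarization."""
    if weights is None:
        weights = np.ones(costs.shape[1]) / costs.shape[1]
    scores = costs @ weights
    return np.argsort(scores)[:k].tolist()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy example: two objectives to minimize, e.g. validation error and latency.
    costs = rng.random((20, 2))
    print("Pareto-based promotion:", select_pareto(costs, k=5))
    print("Scalarized promotion:  ", select_scalarized(costs, k=5))
```

In a successive-halving-style scheduler, a selection rule like the above decides which configurations are promoted to the next fidelity level; the Pareto-based variant avoids committing to a single weighting of the objectives, which is the property the abstract credits for its wall-clock advantage.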