Hyperparameter optimization (HPO) is one of the most critical problems in machine learning, since the choice of hyperparameters has a significant impact on final model performance. Although many HPO algorithms exist, they either lack theoretical guarantees or require strong assumptions. To this end, we introduce BLiE -- a Lipschitz-bandit-based algorithm for HPO that assumes only Lipschitz continuity of the objective function. BLiE exploits the landscape of the objective function to adaptively search over the hyperparameter space. Theoretically, we show that $(i)$ BLiE finds an $\epsilon$-optimal hyperparameter with a total budget of $O \left( \frac{1}{\epsilon} \right)^{d_z + \beta}$, where $d_z$ and $\beta$ are problem-intrinsic quantities; $(ii)$ BLiE is highly parallelizable. Empirically, we demonstrate that BLiE outperforms state-of-the-art HPO algorithms on benchmark tasks. We also apply BLiE to search for the noise schedule of diffusion models. Compared with the default schedule, the schedule found by BLiE greatly improves the sampling speed.
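To make the idea of landscape-adaptive search under a Lipschitz assumption concrete, the following is a minimal one-dimensional sketch in the spirit of deterministic optimistic optimization: intervals are repeatedly split, and the interval whose Lipschitz upper confidence bound is largest is refined first. This is only an illustration of the general principle; the function names and the fixed Lipschitz constant `L` are assumptions, and the sketch is not the BLiE algorithm itself (which handles bandit feedback and budget allocation).

```python
import heapq

def lipschitz_search(f, lo, hi, L, budget):
    """Illustrative optimistic search: split the interval whose
    Lipschitz upper bound f(mid) + L * width / 2 is largest."""
    mid = (lo + hi) / 2
    v = f(mid)
    best_x, best_v = mid, v
    # min-heap keyed on negated upper bound, so the most
    # promising interval is popped first
    heap = [(-(v + L * (hi - lo) / 2), lo, hi)]
    evals = 1
    while heap and evals < budget:
        _, a, b = heapq.heappop(heap)
        # split into two halves and evaluate each midpoint
        for c, d in ((a, (a + b) / 2), ((a + b) / 2, b)):
            m = (c + d) / 2
            v = f(m)
            evals += 1
            if v > best_v:
                best_x, best_v = m, v
            heapq.heappush(heap, (-(v + L * (d - c) / 2), c, d))
            if evals >= budget:
                break
    return best_x, best_v
```

Because only intervals with high optimistic value are refined, evaluations concentrate near the optimum rather than being spread uniformly, which is the same landscape-exploiting behavior the abstract attributes to BLiE.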