The recent rise in popularity of Hyperparameter Optimization (HPO) for deep learning has highlighted the role that good hyperparameter (HP) space design can play in training strong models. In turn, designing a good HP space is critically dependent on understanding the role of different HPs. This motivates research on HP Importance (HPI), e.g., with the popular method of functional ANOVA (f-ANOVA). However, the original f-ANOVA formulation is inapplicable to the subspaces most relevant to algorithm designers, such as those defined by top performance. To overcome this problem, we derive a novel formulation of f-ANOVA for arbitrary subspaces and propose an algorithm that uses Pearson divergence (PED) to enable a closed-form computation of HPI. We demonstrate that this new algorithm, dubbed PED-ANOVA, is able to successfully identify important HPs in different subspaces while also being extremely computationally efficient.
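For reference, the Pearson (chi-squared) divergence between a density $p$ and a reference density $q$ is the standard quantity

\[ D_{\mathrm{PE}}(p \,\|\, q) \;=\; \int q(x)\left(\frac{p(x)}{q(x)} - 1\right)^{2} dx. \]

Loosely speaking (the precise estimator and the construction of the subspaces are given in the paper body, not in this abstract), PED-ANOVA scores a hyperparameter by a divergence of this form between its marginal distribution restricted to a well-performing subspace and its marginal over a reference space, and the choice of density estimator is what makes the computation available in closed form.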