Hyperparameter tuning is one of the most time-consuming parts of machine learning: the performance of a large number of different hyperparameter settings has to be evaluated to find the best one. Although modern optimization algorithms exist that minimize the number of evaluations needed, the evaluation of a single setting is still expensive: using a resampling technique, the machine learning method has to be fitted a fixed number of $K$ times on different training data sets. The mean value over the $K$ fits is then used as an estimator of the setting's performance. Many hyperparameter settings could be discarded after fewer than $K$ resampling iterations because they are already clearly inferior to high-performing settings. In practice, however, the resampling is often performed until the very end, wasting a lot of computational effort. We propose a sequential testing procedure that minimizes the number of resampling iterations needed to detect inferior parameter settings. To this end, we first analyze the distribution of resampling errors and find that a log-normal distribution is a promising assumption. Based on this distribution, we build a sequential testing procedure and employ it within a random search algorithm. We compare a standard random search with our enhanced sequential random search on several realistic data situations. The results show that the sequential random search finds comparably good hyperparameter settings while roughly halving the computational time needed to find them.
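To illustrate the core idea (not the exact procedure developed in the paper), the following Python sketch stops the resampling of a candidate setting as soon as a one-sided test on its log-transformed errors indicates it is clearly worse than the best mean log-error observed so far. The function `resample_with_early_stopping`, the callback `evaluate_fold`, the reference value `best_log_mean`, and the use of a one-sample t-test as the stopping rule are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def resample_with_early_stopping(evaluate_fold, K, best_log_mean,
                                 alpha=0.05, min_folds=3):
    """Evaluate one hyperparameter setting fold by fold and stop early
    once it appears clearly inferior to the best setting seen so far.

    evaluate_fold(k) -> positive resampling error of fold k (hypothetical callback).
    best_log_mean    -> mean log-error of the best setting found so far.
    """
    log_errors = []
    for k in range(K):
        error = evaluate_fold(k)           # fit the model on fold k, get its error
        log_errors.append(np.log(error))   # log-normal errors -> approx. normal log-errors
        if len(log_errors) >= min_folds:
            # One-sided one-sample t-test: is this setting's mean log-error
            # significantly larger (i.e. worse) than the best mean log-error?
            _, p_value = stats.ttest_1samp(log_errors, popmean=best_log_mean,
                                           alternative="greater")
            if p_value < alpha:
                return np.mean(log_errors), k + 1, True   # stopped early: inferior
    return np.mean(log_errors), K, False                  # all K folds were used
```

Within a random search, such a routine would be called for every sampled setting, and `best_log_mean` would be updated whenever a setting completes all $K$ folds with a better mean log-error.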