The performance of deep (reinforcement) learning systems crucially depends on the choice of hyperparameters. Their tuning is notoriously expensive, typically requiring an iterative training process to run for numerous steps to convergence. Traditional tuning algorithms only consider the final performance of hyperparameters, acquired after many expensive iterations, and ignore intermediate information from earlier training steps. In this paper, we present a Bayesian optimization (BO) approach which exploits the iterative structure of learning algorithms for efficient hyperparameter tuning. We propose to learn an evaluation function compressing learning progress at any stage of the training process into a single numeric score according to both training success and stability. Our BO framework then balances the benefit of assessing a hyperparameter setting over additional training steps against their computation cost. We further increase model efficiency by selectively including scores from different training steps for any evaluated hyperparameter set. We demonstrate the efficiency of our algorithm by tuning hyperparameters for the training of deep reinforcement learning agents and convolutional neural networks. Our algorithm outperforms all existing baselines in identifying optimal hyperparameters in minimal time.
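The two ideas in the abstract, compressing a learning curve into a single score and stopping the evaluation of a hyperparameter setting once further training steps no longer justify their cost, can be sketched as follows. This is a minimal illustration, not the paper's method: the score form (recent mean minus a stability penalty) and the names `progress_score` and `evaluate_incrementally` are assumptions for demonstration.

```python
import statistics

def progress_score(returns, stability_weight=0.5):
    """Compress a learning curve into one number: reward recent
    performance, penalize instability (assumed form, for illustration)."""
    recent = returns[-5:]
    return statistics.mean(recent) - stability_weight * statistics.pstdev(recent)

def evaluate_incrementally(train_step, max_steps, step_cost, min_gain):
    """Train one hyperparameter setting step by step, stopping once the
    marginal score gain no longer justifies the added compute cost.
    `train_step(step)` returns the training performance at that step."""
    returns, scores = [], []
    for step in range(max_steps):
        returns.append(train_step(step))
        if len(returns) >= 5:
            scores.append(progress_score(returns))
            # Stop early when the improvement per step falls below
            # a cost-weighted threshold (a stand-in for the BO trade-off).
            if len(scores) >= 2 and scores[-1] - scores[-2] < min_gain * step_cost:
                break
    return (scores[-1] if scores else float("-inf")), len(returns)

# A learning curve that improves, then plateaus at 10: evaluation
# terminates well before the step budget is exhausted.
score, steps_used = evaluate_incrementally(
    lambda s: float(min(s, 10)), max_steps=50, step_cost=1.0, min_gain=0.1
)
```

In a full BO loop, the score returned here would feed the surrogate model, and the stopping rule would be replaced by the framework's benefit-versus-cost criterion.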