Deep learning models are full of hyperparameters, which must be set manually before the learning process can start. Finding the best configuration for these hyperparameters in such a high-dimensional space, where model training and validation are time-consuming and expensive, is a non-trivial challenge. Bayesian optimization is a powerful tool for the joint optimization of hyperparameters, efficiently trading off exploration and exploitation of the hyperparameter space. In this paper, we discuss Bayesian hyperparameter optimization, including hyperparameter optimization, Bayesian optimization, and Gaussian processes. We also review BoTorch, GPyTorch, and Ax, the new open-source frameworks that we use for Bayesian optimization, Gaussian process inference, and adaptive experimentation, respectively. For experimentation, we apply Bayesian hyperparameter optimization to optimize the group weights in weighted group pooling, which couples unsupervised tiered graph autoencoder learning and supervised graph prediction learning for molecular graphs. We find that Ax, BoTorch, and GPyTorch together provide a simple-to-use but powerful framework for Bayesian hyperparameter optimization, via Ax's high-level API, which constructs and runs a full optimization loop and returns the best hyperparameter configuration.
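A minimal sketch of the kind of optimization loop referred to above, using Ax's high-level Loop API (ax.optimize), which internally drives BoTorch and GPyTorch. The hyperparameters and the evaluation function below are illustrative placeholders, not the weighted-group-pooling setup studied in the paper.

```python
from ax import optimize

def evaluate(parameters):
    # Hypothetical objective: in a real setting this would train and validate
    # the model with the given hyperparameters and return the validation metric.
    lr = parameters["lr"]
    weight_decay = parameters["weight_decay"]
    return (lr - 0.01) ** 2 + (weight_decay - 1e-4) ** 2  # stand-in for validation loss

# optimize() constructs and runs the full Bayesian optimization loop and
# returns the best hyperparameter configuration it found.
best_parameters, best_values, experiment, model = optimize(
    parameters=[
        {"name": "lr", "type": "range", "bounds": [1e-5, 1e-1], "log_scale": True},
        {"name": "weight_decay", "type": "range", "bounds": [1e-6, 1e-2], "log_scale": True},
    ],
    evaluation_function=evaluate,
    objective_name="validation_loss",
    minimize=True,
    total_trials=20,
)

print(best_parameters)
```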