Bayesian optimization (BO) is a powerful sequential optimization approach for seeking the global optimum of black-box functions for sample efficiency purposes. Evaluations of black-box functions can be expensive, rendering reduced use of labeled data desirable. For the first time, we introduce a teacher-student model, called $\texttt{TSBO}$, to enable semi-supervised learning that can make use of large amounts of cheaply generated unlabeled data under the context of BO to enhance the generalization of data query models. Our teacher-student model is uncertainty-aware and offers a practical mechanism for leveraging the pseudo labels generated for unlabeled data while dealing with the involved risk. We show that the selection of unlabeled data is key to $\texttt{TSBO}$. We optimize unlabeled data sampling by generating unlabeled data from a dynamically fitted extreme value distribution or a parameterized sampling distribution learned by minimizing the student feedback. $\texttt{TSBO}$ is capable of operating in a learned latent space with reduced dimensionality, providing scalability to high-dimensional problems. $\texttt{TSBO}$ demonstrates the significant sample efficiency in several global optimization tasks under tight labeled data budgets.
翻译:暂无翻译