A typical assumption in supervised machine learning is that the train (source) and test (target) datasets follow the same distribution. This assumption, however, is often violated in uncertain real-world applications, which motivates the study of learning under covariate shift. In this setting, the naive use of adaptive hyperparameter optimization methods such as Bayesian optimization does not work as desired, because it does not account for the distributional shift between datasets. In this work, we consider a novel hyperparameter optimization problem under multi-source covariate shift, whose goal is to find the optimal hyperparameters for a target task of interest using only unlabeled data from the target task and labeled data from multiple source tasks. To conduct efficient hyperparameter optimization for the target task, it is essential to estimate the target objective using only the available information. To this end, we construct a variance-reduced estimator that unbiasedly approximates the target objective with a desirable variance property. Building on the proposed estimator, we provide a general and tractable hyperparameter optimization procedure that works well in our setting with a no-regret guarantee. Experiments demonstrate that the proposed framework broadens the applications of automated hyperparameter optimization.
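The abstract does not spell out the estimator, but the core idea of evaluating a target objective from labeled source data under covariate shift can be illustrated with the classical importance-weighted risk estimator, which reweights source losses by the density ratio between target and source covariates. The sketch below is a minimal, hypothetical example with a single source task and known Gaussian densities (in practice the density ratio must be estimated from unlabeled target samples); all variable names and the self-normalized variant shown as a simple variance-reduction device are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: source covariates x ~ N(0, 1) with labels;
# target covariates x ~ N(0.5, 1) are unlabeled.
n = 10_000
x_src = rng.normal(0.0, 1.0, n)
y_src = np.sin(x_src) + rng.normal(0.0, 0.1, n)  # labels exist only on the source

def loss(x, y, w_param):
    """Squared error of a toy linear model y_hat = w_param * x."""
    return (y - w_param * x) ** 2

def density_ratio(x):
    """p_target(x) / p_source(x) for the two unit-variance Gaussians above.
    Known in closed form here; in practice it is estimated from data."""
    return np.exp(-0.5 * ((x - 0.5) ** 2 - x ** 2))

w = density_ratio(x_src)

# Importance-weighted estimate of the target risk at a fixed hyperparameter
# setting: unbiased, but its variance blows up when the ratio is heavy-tailed.
iw_risk = np.mean(w * loss(x_src, y_src, 1.0))

# Self-normalized variant: slightly biased, typically lower variance --
# one simple instance of the variance-reduction idea.
sniw_risk = np.sum(w * loss(x_src, y_src, 1.0)) / np.sum(w)
```

A hyperparameter optimizer (e.g., Bayesian optimization) would then query such an estimate in place of the unobservable labeled-target objective.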