Bayesian optimization is a popular algorithm for sequential optimization of a latent objective function when sampling from the objective is costly. The search path of the algorithm is governed by the acquisition function, which defines the agent's search strategy. Conceptually, the acquisition function characterizes how the optimizer balances exploration and exploitation when searching for the optimum of the latent objective. In this paper, we explore the inverse problem of Bayesian optimization: we seek to estimate the agent's latent acquisition function based on observed search paths. We introduce a probabilistic solution framework for the inverse problem that provides a principled way to quantify both the variability with which the agent performs the optimization task and the uncertainty around their estimated acquisition function. We illustrate our methods by analyzing human behavior from an experiment designed to force subjects to balance exploration and exploitation in search of an invisible target location. We find that while most subjects demonstrate clear trends in their search behavior, there is significant variation around these trends from round to round. The subjects in our study exhibit a wide range of search strategies, but upper confidence bound acquisition functions offer the best fit for the majority of subjects. Finally, some subjects do not map well to any of the acquisition functions we initially consider; these subjects tend to exhibit exploration preferences beyond what standard acquisition functions can capture. Guided by the model discrepancies, we augment the candidate acquisition functions to yield a superior fit to the human behavior in this task.
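To make the exploration-exploitation trade-off concrete, the following is a minimal sketch of an upper confidence bound (UCB) acquisition function on a one-dimensional grid. It is an illustration only, not the model fitted in the paper: the Gaussian process surrogate, the squared-exponential kernel, the length scale, the trade-off parameter `beta`, and the two observations are all assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_go = rbf_kernel(x_grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; subtract the part explained
    # by the observations.
    var = 1.0 - np.sum(k_go * np.linalg.solve(k_oo, k_go.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb(mean, std, beta):
    """UCB score: larger beta puts more weight on uncertain locations."""
    return mean + beta * std

# Hypothetical search history: two locations already sampled.
x_obs = np.array([0.1, 0.4])
y_obs = np.array([0.2, 0.8])
x_grid = np.linspace(0.0, 1.0, 101)

mean, std = gp_posterior(x_obs, y_obs, x_grid)

# beta = 0 is pure exploitation (follow the posterior mean);
# a large beta favors locations the surrogate is unsure about.
next_exploit = x_grid[np.argmax(ucb(mean, std, beta=0.0))]
next_explore = x_grid[np.argmax(ucb(mean, std, beta=5.0))]
```

Under these assumptions, an agent's position on the exploration-exploitation spectrum is summarized by `beta`. The inverse problem can then be read as inferring which acquisition function, and which parameter value such as `beta`, best explains a sequence of observed choices.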