The problem of sequentially maximizing the expectation of a function seeks to maximize the expected value of a function of interest without direct control over its features. Instead, the distribution of those features depends on a given context and on an action taken by an agent. In contrast to Bayesian optimization, the arguments of the function are not under the agent's control; they are determined indirectly by the action the agent takes in a given context. If information about the features is to be incorporated into the maximization problem, the full conditional distribution of the features, rather than only its expectation, must be accounted for. Moreover, the function itself is unknown: only noisy observations of it are available, and unmatched data sets may be required. We propose a novel algorithm for this problem that accounts for the uncertainty arising from the estimation of both the conditional distribution of the features and the unknown function, modeling the former as a Bayesian conditional mean embedding and the latter as a Gaussian process. Our algorithm empirically outperforms the current state-of-the-art algorithm in the experiments conducted.
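The setup described above can be sketched with a toy example. This is not the paper's algorithm: it assumes a known, illustrative conditional feature distribution and uses a plain Gaussian process regression surrogate for the unknown function, then picks the action whose induced feature distribution maximizes the estimated expectation via Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.3):
    # Squared-exponential kernel between two 1-D point sets
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Unknown function f; in practice only noisy evaluations are observed
f = lambda x: np.sin(3 * x)

# Noisy observations of f used to fit the GP surrogate
Xtr = rng.uniform(-2, 2, 30)
ytr = f(Xtr) + 0.05 * rng.standard_normal(30)

K = rbf(Xtr, Xtr) + 0.05**2 * np.eye(30)
alpha = np.linalg.solve(K, ytr)

def gp_mean(xq):
    # Posterior mean of the GP surrogate at query points xq
    return rbf(xq, Xtr) @ alpha

# Hypothetical conditional feature distribution (assumed for illustration):
# X | context c, action a  ~  N(c + a, 0.1^2)
def sample_features(c, a, n=200):
    return c + a + 0.1 * rng.standard_normal(n)

# For each candidate action, estimate E[f(X)] under the induced feature
# distribution and keep the maximizer
context = 0.2
actions = np.linspace(-1.5, 1.5, 61)
scores = np.array([gp_mean(sample_features(context, a)).mean() for a in actions])
best = actions[int(np.argmax(scores))]
```

The paper's contribution replaces both simplifications: the conditional distribution is itself estimated (as a Bayesian conditional mean embedding) and the resulting estimation uncertainty, together with the GP's, is propagated into the action choice.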