In the adaptive ProbeMax problem, given a collection of mutually independent random variables $X_1, \ldots, X_n$, our goal is to design an adaptive probing policy for sequentially sampling at most $k$ of these variables, with the objective of maximizing the expected maximum value sampled. In spite of its stylized formulation, this setting captures numerous technical hurdles inherent to stochastic optimization, related to both information structure and efficient computation. For these reasons, adaptive ProbeMax has served as a test bed for a multitude of algorithmic methods, and concurrently as a popular teaching tool in courses and tutorials dedicated to recent trends in optimization under uncertainty. The main contribution of this paper is a novel method for upper-bounding the expected maximum reward of optimal adaptive probing policies, based on a simple min-max problem. Equipped with this method, we devise purely combinatorial algorithms for deterministically computing feasible sets whose proximity to the adaptive optimum is analyzed through prophet inequality ideas. Consequently, this approach allows us to establish improved constructive adaptivity gaps for the ProbeMax problem in its broadest form, where $X_1, \ldots, X_n$ are general random variables, with further improvements when $X_1, \ldots, X_n$ are continuous.
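To make the problem statement and the notion of an adaptivity gap concrete, the following is a minimal brute-force sketch in Python. It is not the method proposed in the paper (the min-max upper bound and the combinatorial algorithms are not specified in this abstract); it simply enumerates all policies on a tiny instance. The function names, the toy instance `TOY`, and the assumptions that each variable is discrete and nonnegative are ours, chosen for illustration only.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy instance: each variable is a list of (value, probability) pairs.
# Values are assumed nonnegative, so the running maximum can start at 0.
TOY = [
    [(0.0, 0.5), (3.0, 0.5)],    # A: high variance
    [(1.2, 1.0)],                # B: deterministic "safe" option
    [(0.0, 0.75), (4.0, 0.25)],  # C: long shot
]


def expected_max_of_set(variables, subset):
    """Expected maximum when every variable in a fixed subset is probed."""
    dist = {0.0: 1.0}  # distribution of the running maximum
    for i in subset:
        updated = {}
        for m, pm in dist.items():
            for v, pv in variables[i]:
                key = max(m, v)
                updated[key] = updated.get(key, 0.0) + pm * pv
        dist = updated
    return sum(m * p for m, p in dist.items())


def best_nonadaptive(variables, k):
    """Best fixed set of size min(k, n), found by exhaustive search."""
    n = len(variables)
    size = min(k, n)
    return max(
        expected_max_of_set(variables, s) for s in combinations(range(n), size)
    )


def best_adaptive(variables, k):
    """Value of an optimal adaptive policy, by dynamic programming over
    (set of unprobed variables, current maximum, remaining budget)."""

    @lru_cache(maxsize=None)
    def value(remaining, current_max, budget):
        if budget == 0 or not remaining:
            return current_max
        best = current_max  # the policy may stop early ("at most k" probes)
        for i in remaining:
            rest = remaining - {i}
            expected = sum(
                p * value(rest, max(current_max, v), budget - 1)
                for v, p in variables[i]
            )
            best = max(best, expected)
        return best

    return value(frozenset(range(len(variables))), 0.0, k)


if __name__ == "__main__":
    adaptive = best_adaptive(TOY, 2)
    nonadaptive = best_nonadaptive(TOY, 2)
    print(f"adaptive optimum:     {adaptive:.4f}")
    print(f"best non-adaptive:    {nonadaptive:.4f}")
    print(f"ratio on this instance: {adaptive / nonadaptive:.4f}")
```

On this toy instance with $k = 2$, the optimal adaptive policy probes A first, then picks C if A came up high and B otherwise, for an expected maximum of about 2.225. The best fixed pair, {A, C}, achieves about 2.125. The ratio between these two quantities is what an adaptivity gap bounds in the worst case. The enumeration is exponential in $n$ and is meant only for small examples.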