In a prophet inequality problem, $n$ independent random variables are presented to a gambler one by one. The gambler decides when to stop the sequence and obtains the most recent value as reward. We evaluate a stopping rule by the worst-case ratio between its expected reward and the expectation of the maximum variable. In the classic setting, the order is fixed, and the optimal ratio is known to be 1/2. Three variants of this problem have been extensively studied: the prophet-secretary model, where variables arrive in uniformly random order; the free-order model, where the gambler chooses the arrival order; and the i.i.d. model, where the distributions are all the same, rendering the arrival order irrelevant. Most of the literature assumes that distributions are known to the gambler. Recent work has considered the question of what is achievable when the gambler has access only to a few samples per distribution. Surprisingly, in the fixed-order case, a single sample from each distribution is enough to approximate the optimal ratio, but this is not the case in any of the three variants. We provide a unified proof that for all three variants of the problem, a constant number of samples (independent of n) for each distribution is good enough to approximate the optimal ratios. Prior to our work, this was known to be the case only in the i.i.d. variant. We complement our result showing that our algorithms can be implemented in polynomial time. A key ingredient in our proof is an existential result based on a minimax argument, which states that there must exist an algorithm that attains the optimal ratio and does not rely on the knowledge of the upper tail of the distributions. A second key ingredient is a refined sample-based version of a decomposition of the instance into "small" and "large" variables, first introduced by Liu et al. [EC'21].
翻译:暂无翻译