不确定中断下的在线选择问题 (Online Selection with Uncertain Disruption)

In numerous online selection problems, decision-makers (DMs) must allocate on the fly limited resources to customers with uncertain values. The DM faces the tension between allocating resources to currently observed values and saving them for potentially better, unobserved values in the future. Addressing this tension becomes more demanding if an uncertain disruption occurs while serving customers. Without any disruption, the DM gets access to the capacity information to serve customers throughout the time horizon. However, with uncertain disruption, the DM must act more cautiously due to risk of running out of capacity abruptly or misusing the resources. Motivated by this tension, we introduce the Online Selection with Uncertain Disruption (OS-UD) problem. In OS-UD, a DM sequentially observes n non-negative values drawn from a common distribution and must commit to select or reject each value in real time, without revisiting past values. The disruption is modeled as a Bernoulli random variable with probability p each time DM selects a value. We aim to design an online algorithm that maximizes the expected sum of selected values before a disruption occurs, if any. We evaluate online algorithms using the competitive ratio. Using a quantile-based approach, we devise a non-adaptive single-threshold algorithm that attains a competitive ratio of at least 1-1/e, and an adaptive threshold algorithm characterized by a sequence of non-increasing thresholds that attains an asymptotic competitive ratio of at least 0.745. Both of these results are worst-case optimal within their corresponding class of algorithms.

翻译：在众多在线选择问题中，决策者必须在动态过程中将有限资源分配给具有不确定价值的客户。决策者面临的核心矛盾在于：是将资源分配给当前观测到的价值，还是为未来可能出现的更优价值保留资源。若在服务客户过程中发生不确定的中断，这一矛盾将变得更加突出。若无中断，决策者可在整个时间范围内获取服务客户的容量信息；但面对不确定中断时，由于可能突然耗尽容量或误用资源，决策者必须采取更谨慎的策略。基于此矛盾，我们提出了不确定中断下的在线选择问题。在该问题中，决策者顺序观测从同一分布中抽取的n个非负价值，并需实时决定选择或拒绝每个价值，且不可重新评估过往价值。中断被建模为伯努利随机变量，每次决策者选择价值时以概率p发生。我们的目标是设计一种在线算法，以最大化中断发生前（若发生）所选价值的期望总和。我们采用竞争比评估在线算法性能。通过基于分位数的方法，我们设计了一种非自适应单阈值算法，其竞争比不低于1-1/e；以及一种由非递增阈值序列表征的自适应阈值算法，其渐近竞争比不低于0.745。这两项结果在各自算法类别中均达到最坏情况下的最优性。

相关内容