地平线不确定性下的在线资源分配 (Online Resource Allocation under Horizon Uncertainty)

We study stochastic online resource allocation: a decision maker needs to allocate limited resources to stochastically-generated sequentially-arriving requests in order to maximize reward. At each time step, requests are drawn independently from a distribution that is unknown to the decision maker. Online resource allocation and its special cases have been studied extensively in the past, but prior results crucially and universally rely on the strong assumption that the total number of requests (the horizon) is known to the decision maker in advance. In many applications, such as revenue management and online advertising, the number of requests can vary widely because of fluctuations in demand or user traffic intensity. In this work, we develop online algorithms that are robust to horizon uncertainty. In sharp contrast to the known-horizon setting, no algorithm can achieve even a constant asymptotic competitive ratio that is independent of the horizon uncertainty. We introduce a novel generalization of dual mirror descent which allows the decision maker to specify a schedule of time-varying target consumption rates, and prove corresponding performance guarantees. We go on to give a fast algorithm for computing a schedule of target consumption rates that leads to near-optimal performance in the unknown-horizon setting. In particular, our competitive ratio attains the optimal rate of growth (up to logarithmic factors) as the horizon uncertainty grows large. Finally, we also provide a way to incorporate machine-learned predictions about the horizon which interpolates between the known and unknown horizon settings.

翻译：我们研究的是随机在线资源分配:决策者需要分配有限的资源,用于按顺序提出的按顺序提出的要求,以便最大限度地获得奖励。在每一个时间步骤中,请求都是与决策者所不知道的分布分开的。过去,我们曾广泛研究过在线资源分配及其特殊案例,但以往的结果至关重要,而且普遍依赖于一个强有力的假设,即决策者事先知道请求的总数(地平线),在许多应用程序中,例如收入管理和在线广告,由于需求或用户流量强度的波动,请求的数量可能大不相同。在这项工作中,我们开发的在线算法对地平线不确定性很有利。与已知的正平线设置截然不同的是,任何算法都无法实现一个甚至恒定的、与地平线竞争的竞争性竞争比率,而这一比率与地平线不确定性无关。我们引入了一种新型的双重镜底线,使决策者能够指定一个时间变化的目标消费率时间表,并证明相应的业绩保证。我们继续快速计算一个目标消费率表,从而导致接近地平面的不确定性。在最远平面上实现我们所处的正平面的正平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平方平。