We study stochastic online resource allocation: a decision maker needs to allocate limited resources to stochastically-generated sequentially-arriving requests in order to maximize reward. Motivated by practice, we consider a data-driven setting in which requests are drawn independently from a distribution that is unknown to the decision maker. Online resource allocation and its special cases have been studied extensively in the past, but these previous results crucially and universally rely on a practically-untenable assumption: the total number of requests (the horizon) is known to the decision maker in advance. In many applications, such as revenue management and online advertising, the number of requests can vary widely because of fluctuations in demand or user traffic intensity. In this work, we develop online algorithms that are robust to horizon uncertainty. In sharp contrast to the known-horizon setting, we show that no algorithm can achieve a constant asymptotic competitive ratio that is independent of the horizon uncertainty. We then introduce a novel algorithm that combines dual mirror descent with a carefully-chosen target consumption sequence and prove that it achieves a bounded competitive ratio. Our algorithm is near-optimal in the sense that its competitive ratio attains the optimal rate of growth when the horizon uncertainty grows large.
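To make the idea of pairing a dual update with a target consumption sequence concrete, the following is a minimal sketch for a single resource with accept/reject decisions. It uses the Euclidean instance of dual mirror descent (projected subgradient descent). The function name `dual_descent_allocate`, the parameters `targets` and `step_size`, and the uniform target sequence in the demo are illustrative assumptions; the sketch does not reproduce the specific target sequence or guarantees of this work.

```python
import random


def dual_descent_allocate(requests, budget, targets, step_size):
    """Accept/reject sequential requests against a single budgeted resource.

    requests:  list of (reward, consumption) pairs, revealed one at a time.
    targets:   per-step target consumption (hypothetical input; choosing it
               well under horizon uncertainty is the hard part and is not
               attempted here).
    step_size: step size of the dual update (assumed constant).
    """
    mu = 0.0              # dual price of the resource, kept non-negative
    remaining = budget
    total_reward = 0.0
    decisions = []
    for (reward, consumption), target in zip(requests, targets):
        # Accept when the price-adjusted reward is positive and budget allows.
        accept = reward - mu * consumption > 0 and consumption <= remaining
        used = consumption if accept else 0.0
        if accept:
            remaining -= used
            total_reward += reward
        decisions.append(accept)
        # Projected subgradient step: raise the price when spending exceeds
        # the target, lower it otherwise, and project back onto mu >= 0.
        mu = max(0.0, mu + step_size * (used - target))
    return total_reward, remaining, decisions


# Demo with synthetic i.i.d. requests and a uniform target sequence, which
# corresponds to the known-horizon baseline rather than the robust setting.
rng = random.Random(0)
horizon = 200
demo_requests = [(rng.random(), 1.0) for _ in range(horizon)]
demo_budget = 50.0
demo_targets = [demo_budget / horizon] * horizon
demo_reward, demo_remaining, demo_decisions = dual_descent_allocate(
    demo_requests, demo_budget, demo_targets, step_size=0.05
)
```

With a uniform `targets` list the update steers spending toward `budget / horizon` per step, which presumes the horizon is known. Passing a non-uniform sequence is the hook through which a horizon-robust spending plan would enter.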