Online advertising has recently grown into a highly competitive and complex multi-billion-dollar industry, with advertisers bidding for ad slots at large scales and high frequencies. This has resulted in a growing need for efficient "auto-bidding" algorithms that determine the bids for incoming queries to maximize advertisers' targets subject to their specified constraints. This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS). We quantify efficiency in terms of regret relative to the optimal algorithm, which knows all queries a priori. We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint when the input sequence of queries consists of i.i.d. samples from some distribution. We also integrate our results with the previous work of Balseiro, Lu, and Mirrokni [BLM20] to achieve near-optimal regret while respecting both RoS and fixed budget constraints. Our algorithm follows the primal-dual framework and uses online mirror descent (OMD) for the dual updates. However, we need to use a non-canonical setup of OMD, and therefore the classic low-regret guarantee of OMD, which holds for the adversarial setting in online learning, no longer applies. Nonetheless, in our case, and more generally wherever low-regret dynamics are applied in algorithm design, the gradients encountered by OMD can be far from adversarial, being instead influenced by our algorithmic choices. We exploit this key insight to show that our OMD setup achieves low regret in the realm of our algorithm.
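To make the primal-dual framework concrete, the following is a minimal sketch of an OMD-style auto-bidder for the RoS constraint (total spend at most total value). Everything here is illustrative, not the paper's exact algorithm: we assume second-price auctions, a target RoS ratio of 1, a hypothetical multiplicative dual update with step size `eta`, and a pacing-style bid rule; the paper's actual bid rule, step size, and constraint guarantees differ.

```python
import math

def auto_bid_ros(queries, eta=0.05, mu_init=1.0):
    """Illustrative primal-dual auto-bidding sketch with an OMD-style
    (multiplicative) dual update for a Return-on-Spend constraint:
    keep total spend <= total value. `queries` yields (value, price)
    pairs; a second-price auction is assumed, so winning a query with
    bid >= price costs `price`. All names and constants are hypothetical.
    Unlike the paper's algorithm, this sketch does not guarantee the
    constraint holds on every prefix of the sequence."""
    mu = mu_init                     # dual variable for the RoS constraint
    total_value = total_spend = 0.0
    for value, price in queries:
        bid = value * (1.0 + 1.0 / mu)   # larger mu => more conservative bid
        if bid >= price:                 # win; pay the second price
            total_value += value
            total_spend += price
            g = price - value            # gradient: positive when overspending
        else:
            g = 0.0                      # lost auctions leave the dual unchanged
        # OMD with entropic regularizer reduces to a multiplicative update;
        # clamp below to keep the bid rule well-defined.
        mu = max(mu * math.exp(eta * g), 1e-6)
    return total_value, total_spend
```

When spend on won queries exceeds their value, the gradient `g` is positive, so `mu` grows and subsequent bids shrink; when value exceeds spend, `mu` decays and bidding becomes more aggressive. This feedback loop is exactly where the gradients depend on the algorithm's own choices rather than being adversarial.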