We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple rounds of a repeated auction, subject to budget and/or return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any "intermediate" auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. This holds whether or not the bidding dynamics converge to an equilibrium and regardless of the correlation structure between advertiser valuations.
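To make the idea of a gradient-based autobidder with bandit feedback concrete, here is a minimal sketch of a standard multiplier-based budget-pacing update. It is not the algorithm proposed in this work: it assumes a single bidder in a second-price auction, handles only the budget constraint (the return-on-investment constraint is omitted), and the names `run_pacing`, `step_size`, and `max_multiplier` are hypothetical. The bidder sees only its own win/loss outcome and payment in each round, and it never bids above its remaining budget, so total spend cannot exceed the budget.

```python
import random


def run_pacing(values, competing_bids, budget, step_size=0.05, max_multiplier=10.0):
    """Illustrative budget pacing in a second-price auction (assumed setup).

    Returns (total_value_won, total_spend).
    """
    if not values or len(values) != len(competing_bids):
        raise ValueError("values and competing_bids must be non-empty and equal length")

    target_spend = budget / len(values)  # per-round spend the bidder aims for
    multiplier = 0.0                     # pacing multiplier; higher means more shading
    remaining = budget
    total_value = 0.0

    for value, rival in zip(values, competing_bids):
        # Shade the bid by the multiplier and cap it at the remaining budget.
        bid = min(value / (1.0 + multiplier), remaining)

        # Second-price rule: win if the bid beats the rival, pay the rival's bid.
        won = bid > rival
        payment = rival if won else 0.0
        remaining -= payment
        if won:
            total_value += value

        # Gradient-style step using only the bidder's own payment (bandit feedback):
        # overspending raises the multiplier, underspending lowers it.
        multiplier += step_size * (payment - target_spend)
        multiplier = min(max(multiplier, 0.0), max_multiplier)

    return total_value, budget - remaining


if __name__ == "__main__":
    rng = random.Random(0)
    vals = [rng.random() for _ in range(1000)]
    rivals = [rng.random() for _ in range(1000)]
    value_won, spend = run_pacing(vals, rivals, budget=50.0)
    print(f"value won: {value_won:.2f}, spend: {spend:.2f}")
```

Capping the bid at the remaining budget is what enforces the constraint here, because the second-price payment is always below the winning bid. In a first-price auction the payment equals the bid, so the same cap would still bound the spend, but the shading rule would need to change.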