We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple rounds of a repeated auction, subject to budget and/or return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any "intermediate" auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. This holds whether or not the bidding dynamics converges to an equilibrium and regardless of the correlation structure between advertiser valuations.
翻译:我们研究在网上广告平台上竞争的自动招标算法之间的一种游戏。 每个自动投标人的任务是在预算限制和(或)投资回报的限制下,在多次拍卖中最大限度地增加其广告商的总价值。 我们提出一个基于梯度的学习算法,保证满足所有限制并实现个人遗憾的消失。 我们的算法仅使用土匪反馈,可用于第一或第二价格拍卖,以及任何“中间”拍卖形式。 我们的主要结果是当这些自动竞标人相互竞争时,所有回合的预期液体福利至少是任何分配所预期的最佳液体福利的一半。 这取决于投标动态是否趋于平衡,也取决于广告商估值之间的关联结构。