Understanding the convergence properties of learning dynamics in repeated auctions is a timely and important question, with numerous applications in, e.g., online advertising markets. This work focuses on repeated first-price auctions in which bidders with fixed values for the item learn to bid using mean-based algorithms, a large class of online learning algorithms that includes popular no-regret algorithms such as Multiplicative Weights Update and Follow the Perturbed Leader. We completely characterize the learning dynamics of mean-based algorithms, in terms of convergence to a Nash equilibrium of the auction, in two senses: (1) time-average: the fraction of rounds in which bidders play a Nash equilibrium approaches 1 in the limit; (2) last-iterate: the mixed strategy profile of the bidders approaches a Nash equilibrium in the limit. Specifically, the results depend on the number of bidders with the highest value:
- If the number is at least three, the bidding dynamics almost surely converge to a Nash equilibrium of the auction, both in time-average and in last-iterate.
- If the number is two, the bidding dynamics almost surely converge to a Nash equilibrium in time-average but not necessarily in last-iterate.
- If the number is one, the bidding dynamics may converge to a Nash equilibrium neither in time-average nor in last-iterate.
Our discovery opens up new possibilities in the study of convergence dynamics of learning algorithms.
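To make the setting concrete, the following is a minimal simulation sketch of a repeated first-price auction in which every bidder runs Multiplicative Weights Update (Hedge), one of the mean-based algorithms named in the abstract. It is not the authors' code. The integer bid grid, uniform tie-breaking, full-information feedback, the learning rate `eta`, and the function names `counterfactual_utilities` and `simulate` are all assumptions made for illustration.

```python
import math
import random


def counterfactual_utilities(value, other_bids, bids):
    """Expected utility of each candidate bid against the others' realized bids.

    Assumes a first-price rule with ties broken uniformly at random.
    """
    top = max(other_bids)
    ties = other_bids.count(top)  # number of opponents at the highest bid
    utils = []
    for b in bids:
        if b > top:
            utils.append(float(value - b))          # win outright, pay own bid
        elif b == top:
            utils.append((value - b) / (ties + 1))  # win with prob. 1/(ties+1)
        else:
            utils.append(0.0)                       # lose, pay nothing
    return utils


def simulate(values, rounds=2000, eta=0.1, seed=0):
    """Run Hedge for every bidder; return the list of realized bid profiles."""
    rng = random.Random(seed)
    bids = list(range(max(values)))  # hypothetical bid grid: 0 .. max(values)-1
    cum = [[0.0] * len(bids) for _ in values]  # cumulative utility per bid
    history = []
    for _ in range(rounds):
        profile = []
        for c in cum:
            m = max(c)  # subtract the max for numerical stability
            weights = [math.exp(eta * (x - m)) for x in c]
            profile.append(rng.choices(bids, weights=weights)[0])
        history.append(tuple(profile))
        # Full-information update: every bid is scored against the others' bids.
        for i, v in enumerate(values):
            others = profile[:i] + profile[i + 1:]
            u = counterfactual_utilities(v, others, bids)
            cum[i] = [c + x for c, x in zip(cum[i], u)]
    return history
```

Calling, for example, `simulate([4, 4, 4])` gives a case with three highest-value bidders, and `simulate([4, 4, 2])` or `simulate([4, 2, 2])` give the two- and one-bidder cases. Tallying the bid profiles in the last few hundred rounds is one rough way to inspect time-average behaviour empirically, although a finite simulation cannot confirm or refute the almost-sure statements above.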