Combinatorial auctions (CAs) are widely used as market mechanisms in practice, yet their Bayesian Nash equilibria (BNE) remain poorly understood. Analytical solutions are known only for a few cases in which the problem can be reformulated as a tractable partial differential equation (PDE). In the general case, finding BNE is known to be computationally hard. Previous work on the numerical computation of BNE in auctions has relied on solving such PDEs explicitly, on calculating pointwise best responses in strategy space, or on iteratively solving restricted subgames. In this study, we present a generic yet scalable alternative: a multi-agent equilibrium learning method that represents strategies as neural networks and applies policy iteration based on gradient dynamics in self-play. Most auctions are ex-post nondifferentiable, so gradients may be unavailable or misleading, and we rely on suitable pseudogradient estimates instead. Although it is well known that gradient dynamics cannot guarantee convergence to Nash equilibria (NE) in general, we observe fast and robust convergence to approximate BNE in a wide variety of auctions and present a sufficient condition for convergence.
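To make the idea of pseudogradient self-play concrete, the following is a minimal sketch under assumptions that are not taken from the abstract: a two-bidder first-price sealed-bid auction with valuations drawn uniformly from [0, 1], a one-parameter linear strategy `bid = slope * value` standing in for a neural network, and an evolution-strategies-style perturbation estimator standing in for the "suitable pseudogradient estimates". All function names and hyperparameters are hypothetical. In this toy setting the symmetric BNE is known analytically to be a bid of half the valuation, so the learned slope can be compared against 0.5.

```python
import random


def ex_post_utility(value, bid, opponent_bid):
    """First-price payoff: the highest bid wins and pays its own bid."""
    return value - bid if bid > opponent_bid else 0.0


def average_utility(slope, opponent_slope, valuations):
    """Monte Carlo utility of bidding slope * v against opponent_slope * w."""
    total = 0.0
    for v, w in valuations:
        total += ex_post_utility(v, slope * v, opponent_slope * w)
    return total / len(valuations)


def pseudogradient(slope, valuations, rng, num_perturbations=16, sigma=0.05):
    """Perturbation-based gradient estimate (an assumed estimator).

    The ex-post utility jumps at the win/lose boundary, so instead of
    differentiating it we perturb the strategy parameter with Gaussian
    noise and correlate the noise with the resulting utility change.
    The opponent keeps playing the current, unperturbed strategy.
    """
    baseline = average_utility(slope, slope, valuations)
    estimate = 0.0
    for _ in range(num_perturbations):
        eps = rng.gauss(0.0, sigma)
        gain = average_utility(slope + eps, slope, valuations) - baseline
        estimate += eps * gain
    return estimate / (num_perturbations * sigma ** 2)


def learn_slope(initial_slope=0.9, iterations=200, batch_size=512,
                learning_rate=0.1, seed=0):
    """Gradient ascent in self-play: both bidders share the same slope."""
    rng = random.Random(seed)
    slope = initial_slope
    for _ in range(iterations):
        # Fresh batch of (own valuation, opponent valuation) pairs.
        valuations = [(rng.random(), rng.random()) for _ in range(batch_size)]
        slope += learning_rate * pseudogradient(slope, valuations, rng)
    return slope


if __name__ == "__main__":
    print("learned slope:", learn_slope())
```

Reusing one batch of valuations for the baseline and for every perturbation is a variance-reduction choice of this sketch; the estimates remain noisy, so the learned slope fluctuates around the equilibrium value rather than settling on it exactly.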