Empirical and experimental evidence shows that artificial intelligence algorithms learn to charge supracompetitive prices. In this paper, we develop a theoretical model to study collusion by adaptive learning algorithms. Using a fluid approximation technique, we characterize the learning outcomes in continuous time for general games and identify the main driver of collusion: the coordination bias. In a simple dominant-strategy game, we show how correlation between the algorithms' estimates leads to persistent bias, sustaining collusive actions in the long run. We prove that algorithms that use counterfactual returns to inform their updates avoid this bias and converge to dominant strategies. We design a mechanism with feedback in which the designer reveals ex-post information to help counterfactual computations, and we show that this mechanism implements the social optimum. Finally, we apply our framework to two simulations studied in the literature, of price competition and of auctions, and analytically rationalize their results.
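To make the distinction between the two kinds of update concrete, the following is a minimal sketch under assumptions of our own and not the model analyzed in the paper. Two identical epsilon-greedy learners repeatedly play a prisoner's dilemma. One variant updates only the estimate of the action it actually played. The other updates every action using the payoff it would have earned against the opponent's realized action. The payoff table, the learning rate `alpha`, the exploration rate `epsilon`, the initial estimates, and the function names are all hypothetical choices made for illustration.

```python
import random

# Hypothetical prisoner's-dilemma payoffs (an assumption, not from the paper):
# PAYOFF[own][other], action 0 = collusive, action 1 = dominant strategy.
PAYOFF = [[2.0, 0.0],
          [3.0, 1.0]]


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over the two estimates in q."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return 0 if q[0] >= q[1] else 1


def simulate(counterfactual, periods=5000, alpha=0.1, epsilon=0.05, seed=0):
    """Run two identical learners; return their final estimates and the share
    of periods in which both played the collusive action."""
    rng = random.Random(seed)
    # Assumed starting point: both learners believe the collusive action is better.
    q = [[2.0, 1.0], [2.0, 1.0]]
    collusive = 0
    for _ in range(periods):
        a = [choose(q[0], epsilon, rng), choose(q[1], epsilon, rng)]
        collusive += (a[0] == 0 and a[1] == 0)
        for i in (0, 1):
            other = a[1 - i]
            # Counterfactual learners update every action against the opponent's
            # realized action; the others update only the action they played.
            updated = (0, 1) if counterfactual else (a[i],)
            for k in updated:
                q[i][k] += alpha * (PAYOFF[k][other] - q[i][k])
    return q, collusive / periods


if __name__ == "__main__":
    for flag in (False, True):
        q, share = simulate(counterfactual=flag)
        print(f"counterfactual={flag}: estimates={q}, collusive share={share:.3f}")
```

In this toy payoff table the dominant action earns exactly 1 more than the collusive action against either opponent action. Because the counterfactual variant updates both estimates against the same realized opponent action, the gap between its two estimates converges to 1 whatever actions are played. The variant that updates only the played action has no such guarantee, since each estimate is refreshed only in the periods in which that action is chosen.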