We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.
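To make the learning dynamic concrete, the following is a minimal sketch of one player running an exponential-weights (EXP) update over its possible proposals, under the assumption of bandit feedback with importance-weighted payoff estimates (a common EXP variant, not necessarily the exact algorithm analyzed here); the names `payoff_fn` and `eta` are illustrative placeholders. Each player would run such a loop independently, which is what makes the scheme decentralized and uncoordinated.

```python
import numpy as np

def exp_weights_player(num_actions, num_rounds, payoff_fn, eta=0.1, rng=None):
    """Hedged sketch: one player's exponential-weights (EXP) loop.

    payoff_fn(t, action) is assumed to return this player's realized payoff
    in round t when playing `action` (e.g., proposing to a given partner);
    in a matching market it depends on the other players' simultaneous choices.
    """
    rng = rng or np.random.default_rng()
    cum_payoff = np.zeros(num_actions)  # importance-weighted cumulative payoff estimates
    for t in range(num_rounds):
        # Mixed strategy: probabilities proportional to exp(eta * estimated payoff).
        scores = eta * cum_payoff
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        action = rng.choice(num_actions, p=probs)
        reward = payoff_fn(t, action)
        # Bandit feedback: only the played action's payoff is observed, so use
        # an importance-weighted estimate to keep the update unbiased.
        cum_payoff[action] += reward / probs[action]
    return probs  # final mixed strategy over proposals
```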