We propose a new architecture to approximately learn incentive-compatible, revenue-maximizing auctions from sampled valuations. Our architecture uses the Sinkhorn algorithm to perform a differentiable bipartite matching, which allows the network to learn strategyproof revenue-maximizing mechanisms in settings that the previous RegretNet architecture cannot learn. In particular, our architecture can learn mechanisms in settings without free disposal, where each bidder must be allocated an exact number of items. In experiments, we show that our approach successfully recovers multiple known optimal mechanisms and learns high-revenue, low-regret mechanisms in larger settings where the optimal mechanism is unknown.
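To make the role of the Sinkhorn algorithm concrete, the following is a minimal sketch of an entropy-regularized soft matching between bidders and items. It is an illustration under stated assumptions, not the architecture itself: the function names `logsumexp` and `sinkhorn`, the temperature `eps`, the iteration count, and the score matrix are all hypothetical, and the marginals are chosen so that each of 2 bidders receives exactly 2 of 4 items. In a learned mechanism the scores would presumably come from a network and the same iterations would be written in an automatic-differentiation framework so that gradients can pass through them.

```python
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x))) for a list of floats.
    mx = max(xs)
    return mx + math.log(sum(math.exp(x - mx) for x in xs))

def sinkhorn(scores, row_marginals, col_marginals, eps=0.1, n_iters=200):
    """Soft bipartite matching via log-domain Sinkhorn iterations.

    scores[i][j] is a (hypothetical) score for giving item j to bidder i.
    The returned plan has row sums close to row_marginals and column sums
    equal to col_marginals; the two marginal vectors must have equal totals.
    """
    n, m = len(scores), len(scores[0])
    log_k = [[s / eps for s in row] for row in scores]  # log of the Gibbs kernel
    f = [0.0] * n  # log row scalings
    g = [0.0] * m  # log column scalings
    for _ in range(n_iters):
        # Rescale rows so each bidder's total allocation matches its marginal.
        for i in range(n):
            f[i] = math.log(row_marginals[i]) - logsumexp(
                [log_k[i][j] + g[j] for j in range(m)])
        # Rescale columns so each item is allocated exactly once in total.
        for j in range(m):
            g[j] = math.log(col_marginals[j]) - logsumexp(
                [log_k[i][j] + f[i] for i in range(n)])
    return [[math.exp(log_k[i][j] + f[i] + g[j]) for j in range(m)]
            for i in range(n)]

# Assumed toy setting: 2 bidders, 4 items, each bidder must receive exactly 2 items.
scores = [[3.0, 2.0, 1.0, 0.0],
          [0.0, 1.0, 2.0, 3.0]]
plan = sinkhorn(scores, row_marginals=[2.0, 2.0], col_marginals=[1.0] * 4)
```

Every step is a composition of exponentials, logarithms, and sums, so the resulting plan is a differentiable function of the scores. A smaller `eps` pushes the plan closer to a hard assignment, at the cost of slower convergence and sharper gradients.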