Double Auctions enable decentralized transfer of goods between multiple buyers and sellers, and thus underpin the functioning of many online marketplaces. Buyers and sellers compete in these markets through bidding, but often do not know their own valuations a priori. As allocation and pricing happen through bids, the profitability of participants, and hence the sustainability of such markets, depends crucially on learning the respective valuations through repeated interactions. We initiate the study of Double Auction markets under bandit feedback on both the buyers' and the sellers' side. We show that with confidence-bound-based bidding and `Average Pricing', there is efficient price discovery among the participants. In particular, the buyers and sellers who exchange goods attain $O(\sqrt{T})$ regret in $T$ rounds. The buyers and sellers who do not benefit from exchange in turn experience only $O(\log{T}/\Delta)$ regret in $T$ rounds, where $\Delta$ is the minimum price gap. We augment our upper bound by showing that even with a known fixed price of the good -- a simpler learning problem than Double Auction -- $\omega(\sqrt{T})$ regret is unattainable in certain markets.
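To make the mechanism concrete, the following is a minimal sketch of a single auction round, not the authors' implementation. It assumes the standard average mechanism: buyer bids are sorted in decreasing order, seller asks in increasing order, the first $K$ pairs whose bid is at least the ask trade, and the price is the average of the $K$-th bid and the $K$-th ask. The function names, the confidence radius $\sqrt{2\log t / n}$, and the convention that buyers bid an upper confidence bound are illustrative assumptions; the abstract does not specify them.

```python
import math


def confidence_radius(num_samples: int, round_index: int) -> float:
    """Generic UCB-style radius sqrt(2 log t / n); an assumed form, not from the source."""
    return math.sqrt(2.0 * math.log(round_index) / num_samples)


def buyer_bid(mean_estimate: float, num_samples: int, round_index: int) -> float:
    """Hypothetical buyer bid: empirical mean plus the confidence radius (optimistic)."""
    return mean_estimate + confidence_radius(num_samples, round_index)


def clear_average_pricing(buyer_bids, seller_asks):
    """Clear one round under the assumed average mechanism.

    Returns (price, winning_buyer_indices, winning_seller_indices);
    price is None when no bid meets any ask.
    """
    # Highest bids first, lowest asks first.
    buyers = sorted(range(len(buyer_bids)), key=lambda i: -buyer_bids[i])
    sellers = sorted(range(len(seller_asks)), key=lambda j: seller_asks[j])

    # K = number of leading (bid, ask) pairs with bid >= ask.
    k = 0
    while (k < min(len(buyers), len(sellers))
           and buyer_bids[buyers[k]] >= seller_asks[sellers[k]]):
        k += 1

    if k == 0:
        return None, [], []

    # Average of the last matched bid and the last matched ask.
    price = (buyer_bids[buyers[k - 1]] + seller_asks[sellers[k - 1]]) / 2.0
    return price, buyers[:k], sellers[:k]


if __name__ == "__main__":
    price, winning_buyers, winning_sellers = clear_average_pricing(
        [0.9, 0.5, 0.7], [0.2, 0.8, 0.6]
    )
    print(price, winning_buyers, winning_sellers)
```

In this example two buyer-seller pairs trade, and the remaining buyer and seller are the kind of non-participating agents for whom the abstract states the logarithmic regret bound.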