We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values. On each round, a mechanism first assigns an allocation each to a set of agents and charges them a price; at the end of the round, the agents provide (stochastic) feedback to the mechanism for the allocation they received. This setting is motivated by applications in cloud markets and online advertising where an agent may know her value for an allocation only after experiencing it. Therefore, the mechanism needs to explore different allocations for each agent so that it can learn their values, while simultaneously attempting to find the socially optimal set of allocations. Our focus is on truthful and individually rational mechanisms which imitate the classical VCG mechanism in the long run. To that end, we first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism. We show that these three terms are interdependent via an $\Omega(T^{\frac{2}{3}})$ lower bound for the maximum of these three terms after $T$ rounds of allocations, and describe an algorithm which essentially achieves this rate. Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets. Next, we define asymptotic variants for the truthfulness and individual rationality requirements and provide asymptotic rates to quantify the degree to which both properties are satisfied by the proposed algorithm.
翻译:在代理商不知道其价值的情况下,我们研究一个多方面福利最大化机制的设计问题。在每一回合中,一个机制首先对一组代理商进行分配,并收取价格;在回合结束时,代理商为收到的分配机制提供(随机)反馈。这一设置的动机是在云市场和在线广告中的应用,代理商只有在经历了云市场和在线广告之后才可能知道其对分配的价值。因此,该机制需要为每个代理商探索不同的分配额,以便它能够了解其价值,同时试图找到社会最佳的分配款。我们的重点是真实和个别的合理机制,长期模仿经典VCG机制。为此,我们首先为福利、每个代理商的个别公用事业和机制确定了三种遗憾概念。我们表明,这三个条件通过一个$(T ⁇ frac{2 ⁇ 3 ⁇ 3 ⁇ )来相互依存,在分配款回合之后,对这三个条件的最大限制要小一些,同时试图找到社会最佳的分配款组合。我们的重点是真实的和单独算法率。我们的框架也提供了对真实性价格的灵活度,即我们作为真实性价格的排序,我们提供了对真实性、真实性要求的排序的判断。