We consider a seller facing buyers who can delay their decision, an ability we call patience. Each buyer's type consists of a value and a patience, and types are sampled i.i.d. from a distribution. The seller uses posted prices and aims to maximize her revenue from selling to the buyers. In this paper, we formalize this setting and characterize the resulting Stackelberg equilibrium, in which the seller first commits to her strategy and the buyers then best respond. We then show how to compute both the optimal pure strategy and the optimal mixed strategy. Next, we consider a learning setting in which the seller does not have access to the distribution over buyers' types. Our main results are the following. We derive a sample complexity bound for learning an approximately optimal pure strategy by computing the fat-shattering dimension of this setting. Moreover, we provide a general sample complexity bound for learning an approximately optimal mixed strategy. We also consider an online setting and derive a vanishing regret bound with respect to both the optimal pure strategy and the optimal mixed strategy.