We study an off-policy contextual pricing problem in which the seller has access to samples of the prices that customers were previously offered, whether they purchased at that price, and auxiliary features describing the customer and/or the item being sold. This contrasts with the well-studied setting in which samples of the customer's valuation (willingness to pay) are observed directly. In our setting, the observed data are influenced by the previous pricing policy, and we do not know how customers would have responded to alternative prices. We introduce loss functions suited to this setting that can be directly optimized to find an effective pricing policy with expected revenue guarantees, without estimating an intermediate demand function. We focus on convex loss functions, which are particularly relevant when linear pricing policies are desired for interpretability, since they yield a tractable convex revenue optimization problem. We propose generalized hinge and quantile pricing loss functions that price, respectively, at a multiplicative factor of the conditional expected valuation or at a particular quantile of the prices that sold, even though valuation data are not observed. We prove expected revenue bounds for both pricing policies when the valuation distribution is log-concave, and we provide generalization bounds for the finite-sample case. Finally, we conduct simulations on both synthetic and real-world data to demonstrate that this approach is competitive with, and in some settings outperforms, state-of-the-art methods in contextual pricing.
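To make the idea of pricing at a quantile of the prices that sold concrete, the following is a minimal sketch under stated assumptions, not the loss functions defined in the paper. The abstract does not give the exact form of the quantile pricing loss, so the sketch substitutes the standard pinball (quantile regression) loss, applied only to transactions that ended in a purchase, and fits a linear pricing policy by subgradient descent. The function names `pinball_loss` and `fit_linear_quantile_price`, the step-size schedule, and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def pinball_loss(residual, tau):
    """Standard pinball (quantile) loss, convex in the residual.

    Assumption: used here as a stand-in for the paper's quantile
    pricing loss, whose exact form the abstract does not state.
    """
    return np.where(residual >= 0, tau * residual, (tau - 1.0) * residual)


def fit_linear_quantile_price(X, prices, sold, tau=0.5, lr=0.5, epochs=2000):
    """Fit a linear policy p(x) = x @ w + b on sold transactions only.

    X      : (n, d) array of customer/item features
    prices : (n,) array of historically offered prices
    sold   : (n,) array of 0/1 purchase indicators
    tau    : target quantile of the prices that sold
    """
    mask = sold == 1
    Xs, ps = X[mask], prices[mask]
    n, d = Xs.shape
    w, b = np.zeros(d), 0.0
    for t in range(epochs):
        resid = ps - (Xs @ w + b)
        # Subgradient of the pinball loss with respect to the prediction:
        # -tau where the sold price is above the prediction, 1 - tau otherwise.
        g = np.where(resid > 0, -tau, 1.0 - tau)
        step = lr / np.sqrt(t + 1)  # decaying step size (hypothetical choice)
        w -= step * (Xs.T @ g) / n
        b -= step * g.mean()
    return w, b
```

Because the objective is convex and the policy is linear, the fit stays a tractable convex problem, which is the property the abstract highlights. On suitable data, roughly a fraction `tau` of the sold prices should fall at or below the fitted price; the sketch makes no claim about the revenue guarantees proved in the paper.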