We study a pricing setting where each customer is offered a contextualized price based on customer and/or product features that are predictive of the customer's valuation for that product. Often only historical sales records are available, in which we observe whether each customer purchased a product at the prescribed price rather than the customer's true valuation. The data is therefore influenced by the historical sales policy, which introduces difficulties in (a) estimating the future loss or regret of pricing policies without the possibility of conducting real experiments and (b) optimizing new policies for downstream tasks such as revenue management. We study how to formulate loss functions that can be used to optimize pricing policies directly, rather than going through an intermediate demand estimation stage, which can be biased in practice due to model misspecification, regularization, or poor calibration. While existing approaches have been proposed for the case where valuation data is available, we propose loss functions for the observational data setting. To achieve this, we adapt ideas from machine learning with corrupted labels: each observed customer outcome (purchased or not at a prescribed price) can be regarded as a known probabilistic transformation of the customer's valuation. From this transformation we derive a class of suitable unbiased loss functions. Within this class we identify minimum-variance estimators and estimators that are robust to poor demand function estimation, and we provide guidance on when the estimated demand function is useful. Furthermore, we show that, when applied to our contextual pricing setting, estimators popular in the off-policy evaluation literature fall within this class of loss functions, and we offer managerial insights on when each estimator is likely to perform well in practice.
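To make the idea of evaluating a pricing policy from purchase records concrete, the following minimal sketch estimates the mean revenue per customer of a new policy using inverse propensity scoring, one familiar estimator from the off-policy evaluation literature. It is an illustration under stated assumptions, not the method of this work: it assumes a discrete price grid, known logging probabilities for the historical prices, and a deterministic target policy. The function name `ips_revenue` and all argument names are hypothetical.

```python
from typing import Sequence


def ips_revenue(
    logged_prices: Sequence[float],
    purchased: Sequence[int],
    propensities: Sequence[float],
    target_prices: Sequence[float],
) -> float:
    """Inverse-propensity estimate of mean revenue per customer.

    Assumptions (not taken from the source text):
      - logged_prices[i] is the price the historical policy offered to customer i,
      - purchased[i] is 1 if customer i bought at that price, else 0,
      - propensities[i] is the known probability with which the historical
        policy chose logged_prices[i] for customer i (must be > 0),
      - target_prices[i] is the price a deterministic new policy would offer.
    """
    n = len(logged_prices)
    if n == 0:
        raise ValueError("at least one record is required")
    if not (len(purchased) == len(propensities) == len(target_prices) == n):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for price, bought, prob, target in zip(
        logged_prices, purchased, propensities, target_prices
    ):
        if prob <= 0:
            raise ValueError("propensities must be strictly positive")
        # Only records where the logged price matches the target price
        # carry information about the target policy's revenue; each such
        # record is reweighted by the inverse of its logging probability.
        if price == target:
            total += price * bought / prob
    return total / n


# Usage: four customers, historical prices drawn uniformly from {10, 20}.
estimate = ips_revenue(
    logged_prices=[10, 20, 10, 20],
    purchased=[1, 0, 1, 1],
    propensities=[0.5, 0.5, 0.5, 0.5],
    target_prices=[10, 10, 20, 20],
)
print(estimate)  # 15.0
```

The estimate is unbiased only if every price the target policy can offer had positive probability under the historical policy. Its variance grows as the logging probabilities shrink, which is the kind of trade-off that motivates looking for lower-variance members of a class of unbiased losses.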