Learning Underspecified Models

This paper examines whether one can learn to play an optimal action while knowing only part of the true specification of the environment. We use the optimal pricing problem as our laboratory: a monopolist is endowed with an underspecified model of market demand but can observe market outcomes. In contrast to conventional learning models, where the model specification is complete and exogenously fixed, the monopolist has to learn both the specification and the parameters of the demand curve from the data. Following the machine learning literature (Shalev-Shwartz and Ben-David (2014)), we formulate the learning dynamics as an algorithm that forecasts the optimal price based on the data. Inspired by PAC learnability, we develop a new notion of learnability by requiring that the algorithm produce an accurate forecast with a reasonable amount of data, uniformly over the class of models consistent with the known part of the true specification. In addition, we assume that the monopolist has a lexicographic preference over the payoff and the complexity cost of the algorithm, seeking an algorithm with a minimum number of parameters subject to PAC-guaranteeing the optimal solution (Rubinstein (1986)). We show that for the set of demand curves with a strictly decreasing, uniformly Lipschitz continuous marginal revenue curve, the optimal algorithm recursively estimates the slope and the intercept of a linear demand curve, even if the actual demand curve is not linear. The monopolist thus chooses a misspecified model to save computational cost, while still learning the true optimal decision uniformly over the set of underspecified demand curves.
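To make the idea of recursively estimating a slope and an intercept concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a standard recursive least squares update, a zero marginal cost by default, and a small random price perturbation to keep the two parameters identifiable. The names `RecursiveLinearDemand`, `forecast_price`, and `simulate`, and all numeric defaults, are hypothetical and chosen only for illustration.

```python
import random


class RecursiveLinearDemand:
    """Recursive least squares fit of q = intercept + slope * p (illustrative)."""

    def __init__(self, prior_scale=1e3):
        self.theta = [0.0, 0.0]  # [intercept, slope], both start at zero
        # Large initial covariance: a weak prior, so the data dominate quickly.
        self.P = [[prior_scale, 0.0], [0.0, prior_scale]]

    def update(self, price, quantity):
        """Fold one (price, quantity) observation into the estimates."""
        x = (1.0, price)
        P = self.P
        Px = [P[0][0] * x[0] + P[0][1] * x[1],
              P[1][0] * x[0] + P[1][1] * x[1]]
        denom = 1.0 + x[0] * Px[0] + x[1] * Px[1]
        gain = [Px[0] / denom, Px[1] / denom]
        error = quantity - (self.theta[0] * x[0] + self.theta[1] * x[1])
        self.theta = [self.theta[0] + gain[0] * error,
                      self.theta[1] + gain[1] * error]
        # P is symmetric, so x^T P equals (P x)^T.
        self.P = [[P[i][j] - gain[i] * Px[j] for j in range(2)]
                  for i in range(2)]

    def forecast_price(self, marginal_cost=0.0, fallback=1.0):
        """Price maximizing (p - c) * (intercept + slope * p) under the fit."""
        intercept, slope = self.theta
        if slope >= 0.0:
            # No downward-sloping fit yet: use an arbitrary fallback price.
            return fallback
        return 0.5 * (marginal_cost - intercept / slope)


def simulate(demand, periods=500, noise=0.1, max_price=5.0, seed=0):
    """Post the forecast price plus a small perturbation, observe, update."""
    rng = random.Random(seed)
    learner = RecursiveLinearDemand()
    for _ in range(periods):
        price = learner.forecast_price() + rng.uniform(-0.5, 0.5)
        price = min(max_price, max(0.0, price))  # keep prices in a sane range
        quantity = demand(price) + rng.gauss(0.0, noise)
        learner.update(price, quantity)
    return learner
```

Calling `simulate(lambda p: 10.0 - 2.0 * p)` runs the sketch against a linear demand curve whose profit-maximizing price at zero cost is 2.5. Passing a nonlinear `demand` function shows where the linear fit settles, but this sketch carries none of the uniform learning guarantees established in the paper.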
