We consider assortment optimization over a continuous spectrum of products represented by the unit interval, where the seller's problem consists of determining the optimal subset of products to offer to potential customers. To describe the relation between assortment and customer choice, we propose a probabilistic choice model that forms the continuous counterpart of the widely studied discrete multinomial logit model. We consider the seller's problem under incomplete information, propose a stochastic-approximation type of policy, and show that its regret -- its performance loss compared to the optimal policy -- is only logarithmic in the time horizon. We complement this result by showing a matching lower bound on the regret of any policy, implying that our policy is asymptotically optimal. We then show that adding a capacity constraint significantly changes the structure of the problem: we construct a policy and show that its regret after $T$ time periods is bounded above by a constant times $T^{2/3}$ (up to a logarithmic term); in addition, we show that the regret of any policy is bounded from below by a positive constant times $T^{2/3}$, so that also in the capacitated case we obtain asymptotic optimality. Numerical illustrations show that our policies outperform or are on par with alternatives.
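To make the continuous logit idea concrete, the following minimal sketch evaluates the expected revenue of an assortment numerically. It rests on assumptions that the abstract does not state: a preference density `v` on the unit interval, a marginal revenue function `w`, a no-purchase weight normalised to 1, and expected revenue of the form $\int_S v(x)w(x)\,dx \,/\, (1 + \int_S v(x)\,dx)$, by analogy with the discrete multinomial logit model. The names `expected_revenue`, `threshold_set`, `v`, and `w` are hypothetical and chosen for illustration only.

```python
import math


def expected_revenue(offered, v, w, n=10_000):
    """Midpoint-rule approximation of the assumed revenue functional
    r(S) = int_S v(x) w(x) dx / (1 + int_S v(x) dx) on [0, 1].

    `offered(x)` returns True when product x belongs to the assortment S.
    """
    h = 1.0 / n
    numerator = 0.0
    denominator = 1.0  # no-purchase weight normalised to 1 (assumption)
    for i in range(n):
        x = (i + 0.5) * h
        if offered(x):
            numerator += v(x) * w(x) * h
            denominator += v(x) * h
    return numerator / denominator


def threshold_set(w, c):
    """Assortment of all products whose marginal revenue is at least c."""
    return lambda x: w(x) >= c


# Hypothetical primitives: uniform preference, revenue increasing in x.
v = lambda x: 1.0
w = lambda x: x

full = expected_revenue(lambda x: True, v, w)

# Grid search over revenue thresholds c in [0, 1].
best_c, best_r = max(
    ((c / 100, expected_revenue(threshold_set(w, c / 100), v, w))
     for c in range(101)),
    key=lambda pair: pair[1],
)
print(f"full assortment: {full:.4f}, best threshold {best_c:.2f}: {best_r:.4f}")
```

For these illustrative primitives, offering only the products with revenue above a threshold yields more than offering the whole interval, and the best threshold found by the grid search is close to the revenue it achieves. That fixed-point property is familiar from the discrete multinomial logit model; whether it carries over to the model studied here is an assumption of this sketch.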