Optimizing the assortment of products displayed to customers is key to increasing revenue for both offline and online retailers. In this paper, we adopt the multinomial logit (MNL) choice model to capture customers' choices over products and study the problem of optimizing assortments over a planning horizon $T$ to maximize the retailer's profit, trading off exploration of customers' preferences against exploitation of the choices learned from data. To make the problem setting more practical, we consider both an inventory constraint and a limited-switches constraint: the retailer cannot use up the resource inventory before time $T$ and is forbidden from switching the assortment shown to customers too many times. Such a setting suits the case where an online retailer wants to dynamically optimize the assortment selection for a population of customers. We develop an efficient UCB-like algorithm that optimizes the assortments while learning customers' choices from data. We prove that our algorithm achieves a sub-linear regret bound of $\tilde{O}\left(T^{1-\alpha/2}\right)$ when $O(T^\alpha)$ switches are allowed. Extensive numerical experiments show that our algorithm outperforms baselines and that the gap between its performance and the theoretical upper bound is small.
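To make the MNL choice model concrete, the following is a minimal sketch of how purchase probabilities and expected per-customer revenue could be computed for a fixed assortment. It assumes the common MNL parametrization in which each product $i$ has a preference weight $v_i$ and the no-purchase option has weight 1; the names `v`, `r`, `mnl_choice_probabilities`, and `expected_revenue` are illustrative and do not come from the paper, and the sketch does not include the inventory or switching constraints, nor the learning step.

```python
from typing import Dict, Hashable, Iterable, Optional


def mnl_choice_probabilities(
    assortment: Iterable[Hashable], v: Dict[Hashable, float]
) -> Dict[Optional[Hashable], float]:
    """Return MNL choice probabilities for the offered assortment.

    Assumption: the no-purchase option has preference weight 1, so
    P(i | S) = v_i / (1 + sum_{j in S} v_j). The key None holds the
    no-purchase probability.
    """
    items = list(assortment)
    denom = 1.0 + sum(v[i] for i in items)
    probs: Dict[Optional[Hashable], float] = {i: v[i] / denom for i in items}
    probs[None] = 1.0 / denom
    return probs


def expected_revenue(
    assortment: Iterable[Hashable],
    v: Dict[Hashable, float],
    r: Dict[Hashable, float],
) -> float:
    """Expected revenue from one customer shown the given assortment."""
    items = list(assortment)
    probs = mnl_choice_probabilities(items, v)
    return sum(r[i] * probs[i] for i in items)


# Hypothetical weights and per-unit revenues for three products.
v = {1: 1.0, 2: 0.5, 3: 0.5}
r = {1: 4.0, 2: 6.0, 3: 2.0}
probs = mnl_choice_probabilities([1, 2], v)
revenue = expected_revenue([1, 2], v, r)
```

In a learning setting the weights `v` are unknown, and a UCB-style method would replace them with optimistic estimates before choosing the assortment; that step is omitted here.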