Product ranking is a core problem for revenue-maximizing online retailers. To design effective product ranking algorithms, various consumer choice models have been proposed to characterize how consumers behave when presented with a list of products. However, existing work assumes that each consumer either purchases at most one product or keeps viewing the product list after purchasing a product, which does not match common behavior in practice. In this paper, we assume that each consumer can purchase multiple products at will. To model consumers' willingness to view and purchase, we introduce a random attention span and a random purchase budget, which determine the maximum number of products that the consumer views and purchases, respectively. Under this setting, we first design an optimal ranking policy for the case where the online retailer can precisely model consumers' behaviors. Based on this policy, we further develop the Multiple-Purchase-with-Budget UCB (MPB-UCB) algorithms with $\tilde{O}(\sqrt{T})$ regret, which estimate consumers' behaviors and maximize revenue simultaneously in online settings. Experiments on both synthetic and semi-synthetic datasets demonstrate the effectiveness of the proposed algorithms.
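To make the setting concrete, the following is a minimal simulation sketch of one possible reading of the consumer model described above. It is not the paper's formal model: the names `simulate_consumer` and `average_revenue`, the independent per-product purchase probabilities, and the uniform distributions for attention span and purchase budget are all assumptions made for illustration.

```python
import random


def simulate_consumer(ranking, purchase_prob, revenue, attention_span, budget, rng):
    """Return the revenue from one consumer who scans `ranking` top to bottom.

    Assumption: the consumer views at most `attention_span` products, buys each
    viewed product independently with probability `purchase_prob[product]`,
    and leaves once `budget` purchases have been made.
    """
    total = 0.0
    bought = 0
    for product in ranking[:attention_span]:
        if bought >= budget:
            break  # purchase budget exhausted, consumer stops
        if rng.random() < purchase_prob[product]:
            total += revenue[product]
            bought += 1
    return total


def average_revenue(ranking, purchase_prob, revenue, n_consumers=10_000, seed=0):
    """Monte Carlo estimate of the expected revenue per consumer for a ranking.

    Assumption: attention span and budget are drawn uniformly from
    1..len(ranking); the abstract does not specify these distributions.
    """
    rng = random.Random(seed)
    n = len(ranking)
    total = 0.0
    for _ in range(n_consumers):
        attention_span = rng.randint(1, n)
        budget = rng.randint(1, n)
        total += simulate_consumer(
            ranking, purchase_prob, revenue, attention_span, budget, rng
        )
    return total / n_consumers


if __name__ == "__main__":
    # Hypothetical catalogue of four products.
    revenue = [4.0, 3.0, 2.0, 1.0]
    purchase_prob = [0.1, 0.3, 0.5, 0.7]
    for ranking in ([0, 1, 2, 3], [3, 2, 1, 0]):
        print(ranking, average_revenue(ranking, purchase_prob, revenue))
```

Comparing the estimated revenue of different rankings in such a simulator illustrates why the order matters: a product placed beyond the attention span, or reached after the budget is exhausted, contributes nothing.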