We consider a seller offering a large network of $N$ products over a time horizon of $T$ periods. The seller does not know the parameters of the products' linear demand model and can dynamically adjust product prices to learn the demand model from sales observations. The seller aims to minimize its pseudo-regret, i.e., the expected revenue loss relative to a clairvoyant who knows the underlying demand model. We consider a sparse set of demand relationships between products to characterize various connectivity properties of the product network. In particular, we study three different sparsity frameworks: (1) $L_0$ sparsity, which constrains the number of connections in the network; (2) off-diagonal sparsity, which constrains the magnitude of cross-product price sensitivities; and (3) a new notion of spectral sparsity, which constrains the asymptotic decay of a similarity metric on network nodes. We propose a dynamic pricing-and-learning policy that combines the optimism-in-the-face-of-uncertainty and PAC-Bayesian approaches, and show that this policy achieves asymptotically optimal performance in terms of $N$ and $T$. We also show that, in the case of spectral and off-diagonal sparsity, the seller can achieve a pseudo-regret linear in $N$, even when the network is dense.
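To make the notion of pseudo-regret concrete, the following is a minimal sketch for two products. It assumes a linear demand model of the form $d(p) = a - Bp$, where the sign convention, the symbols $a$ and $B$, and all function names are illustrative assumptions and not taken from the paper. It does not implement the proposed pricing-and-learning policy; it only evaluates the expected revenue loss of a given price sequence relative to a clairvoyant, and counts cross-product connections in the sense of $L_0$ sparsity.

```python
def expected_demand(a, B, p):
    """Expected demand under the assumed linear model d(p) = a - B p."""
    n = len(a)
    return [a[i] - sum(B[i][j] * p[j] for j in range(n)) for i in range(n)]


def expected_revenue(a, B, p):
    """Expected revenue p . d(p) at price vector p."""
    return sum(pi * di for pi, di in zip(p, expected_demand(a, B, p)))


def clairvoyant_price_2x2(a, B):
    """Revenue-maximizing price for two products.

    Solves the first-order condition (B + B^T) p = a by Cramer's rule.
    Assumes B + B^T is positive definite, so the solution is a maximizer.
    """
    m00, m01 = 2 * B[0][0], B[0][1] + B[1][0]
    m10, m11 = B[0][1] + B[1][0], 2 * B[1][1]
    det = m00 * m11 - m01 * m10
    return [(a[0] * m11 - m01 * a[1]) / det,
            (m00 * a[1] - m10 * a[0]) / det]


def pseudo_regret(a, B, prices):
    """Sum over periods of the expected revenue gap to the clairvoyant."""
    best = expected_revenue(a, B, clairvoyant_price_2x2(a, B))
    return sum(best - expected_revenue(a, B, p) for p in prices)


def l0_connections(B):
    """Number of nonzero off-diagonal entries of B (cross-product links)."""
    n = len(B)
    return sum(1 for i in range(n) for j in range(n)
               if i != j and B[i][j] != 0)
```

For example, `pseudo_regret(a, B, prices)` takes a list of price vectors, one per period, so a policy that always charges the clairvoyant price incurs zero pseudo-regret, and any other price sequence incurs a positive value under the positive-definiteness assumption above.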