We consider the dynamic pricing problem with covariates under a generalized linear demand model: a seller can dynamically adjust the price of a product over a horizon of $T$ time periods, and in each time period $t$, the demand for the product is jointly determined by the price and an observable covariate vector $x_t\in\mathbb{R}^d$ through an unknown generalized linear model. Most of the existing literature assumes that the covariate vectors $x_t$ are independent and identically distributed (i.i.d.); the few papers that relax this assumption either sacrifice model generality or yield sub-optimal regret bounds. In this paper, we show that a simple pricing algorithm has an $O(d\sqrt{T}\log T)$ regret upper bound without assuming any statistical structure on the covariates $x_t$ (they may even be chosen arbitrarily). This upper bound matches the lower bound (which holds even under the i.i.d. assumption) up to logarithmic factors. Our paper thus shows that (i) the i.i.d. assumption is not necessary for obtaining low regret, and (ii) the regret bound can be independent of the (inverse) minimum eigenvalue of the covariance matrix of the $x_t$, a quantity that appears in previous bounds. Furthermore, we discuss a condition under which a smaller regret is achievable and how a Thompson sampling algorithm can be applied to compute the prices efficiently.
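For concreteness, the regret referred to above can be written out as follows. This is only a sketch in illustrative notation that the abstract itself does not fix: the link function $g$, the feature map $\phi$, the unknown parameter $\theta^*$, and the feasible price set $\mathcal{P}$ are all assumptions introduced here for exposition.

$$
\mathbb{E}[D_t \mid x_t, p_t] = g\big(\phi(x_t, p_t)^\top \theta^*\big),
\qquad
r_t(p) = p \, g\big(\phi(x_t, p)^\top \theta^*\big),
$$

$$
\mathrm{Regret}_T = \sum_{t=1}^{T} \Big( \max_{p \in \mathcal{P}} r_t(p) - r_t(p_t) \Big).
$$

Here $D_t$ is the demand in period $t$, $p_t$ is the price the seller posts, and $r_t(p)$ is the expected revenue from posting price $p$ given the covariate $x_t$. In this notation, the stated upper bound says that $\mathrm{Regret}_T$ grows as $O(d\sqrt{T}\log T)$ for any covariate sequence $x_1, \dots, x_T$, with no distributional assumption on that sequence.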