Feature-based dynamic pricing is an increasingly popular model for setting prices for highly differentiated products, with applications in digital marketing, online sales, real estate, and so on. The problem was formally studied as an online learning problem [Javanmard & Nazerzadeh, 2019] in which a seller needs to propose prices on the fly for a sequence of $T$ products based on their features $x$ while incurring a small regret relative to the best -- "omniscient" -- pricing strategy she could have come up with in hindsight. We revisit this problem and provide two algorithms (EMLP and ONSP) for stochastic and adversarial feature settings, respectively, and prove the optimal $O(d\log{T})$ regret bounds for both. In comparison, the best existing results are $O\left(\min\left\{\frac{1}{\lambda_{\min}^2}\log{T}, \sqrt{T}\right\}\right)$ and $O(T^{2/3})$, respectively, with $\lambda_{\min}$ being the smallest eigenvalue of $\mathbb{E}[xx^\top]$, which could be arbitrarily close to $0$. We also prove an $\Omega(\sqrt{T})$ information-theoretic lower bound for a slightly more general setting, which demonstrates that "knowing-the-demand-curve" leads to an exponential improvement in feature-based dynamic pricing.
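To make the notion of regret concrete, the following is a minimal sketch of how regret against an omniscient pricing strategy could be measured in simulation. It is not the EMLP or ONSP algorithm. Everything specific here is an assumption made for illustration: the linear valuation model $x^\top\theta$ with Gaussian noise of known scale `sigma`, the price grid used to approximate the best price, and the helper names `sale_probability`, `expected_revenue`, `best_price`, and `simulate_regret`.

```python
import math
import numpy as np


def sale_probability(price, mean_value, sigma):
    """P(valuation >= price), assuming valuation ~ N(mean_value, sigma^2)."""
    z = (price - mean_value) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))


def expected_revenue(price, mean_value, sigma):
    """Expected revenue of posting `price`: price times the sale probability."""
    return price * sale_probability(price, mean_value, sigma)


def best_price(mean_value, sigma, grid):
    """Revenue-maximizing price on `grid` (stand-in for the omniscient price)."""
    revenues = [expected_revenue(p, mean_value, sigma) for p in grid]
    return float(grid[int(np.argmax(revenues))])


def simulate_regret(policy, theta, features, sigma, grid):
    """Cumulative expected regret of `policy` relative to the best grid price.

    `policy` is a hypothetical callable mapping a feature vector to a price.
    `theta` is the (assumed) true parameter, known only to the benchmark.
    """
    regret = 0.0
    for x in features:
        mean_value = float(x @ theta)
        p_star = best_price(mean_value, sigma, grid)
        p = policy(x)
        regret += (expected_revenue(p_star, mean_value, sigma)
                   - expected_revenue(p, mean_value, sigma))
    return regret


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([1.0, 0.5, 0.8])                 # assumed true parameter
    features = rng.uniform(0.0, 1.0, size=(200, 3))   # stochastic features
    grid = np.linspace(0.0, 5.0, 501)

    # A naive policy that ignores the features and always posts one grid price.
    fixed_policy = lambda x: float(grid[100])
    print("regret of fixed price:", simulate_regret(fixed_policy, theta, features, 1.0, grid))
```

A feature-blind fixed price accumulates regret that grows linearly in the number of rounds under these assumptions, which is the baseline that the $O(d\log{T})$ bounds stated in the abstract improve upon.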