We study contextual dynamic pricing when a target market can leverage $K$ auxiliary markets (offline logs or concurrent streams) whose mean utilities differ by a structured preference shift. We propose Cross-Market Transfer Dynamic Pricing (CM-TDP), the first algorithm that provably handles such model-shift transfer and delivers minimax-optimal regret for both linear and non-parametric utility models. For linear utilities of dimension $d$, where the difference between source- and target-task coefficients is $s_{0}$-sparse, CM-TDP attains regret $\tilde{O}((d/K+s_{0})\log T)$. For nonlinear demand residing in a reproducing kernel Hilbert space (RKHS) with effective dimension $\alpha$, complexity $\beta$, and task-similarity parameter $H$, the regret becomes $\tilde{O}\!(K^{-2\alpha\beta/(2\alpha\beta+1)}T^{1/(2\alpha\beta+1)} + H^{2/(2\alpha+1)}T^{1/(2\alpha+1)})$, matching information-theoretic lower bounds up to logarithmic factors. The RKHS bound is the first of its kind for transfer pricing and is of independent interest. Extensive simulations show up to 50% lower cumulative regret and 5 times faster learning relative to single-market pricing baselines. By bridging transfer learning, robust aggregation, and revenue optimization, CM-TDP moves toward pricing systems that transfer faster and price smarter.
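To make the scaling in the two regret bounds concrete, the following is a minimal Python sketch that evaluates only their leading-order terms. It is not CM-TDP itself: the function names `linear_rate` and `rkhs_rate` are hypothetical, and all constants and the extra logarithmic factors hidden by $\tilde{O}$ are dropped, so the values are meaningful only for comparing how the rates move with $K$, $s_{0}$, $H$, and $T$.

```python
import math


def linear_rate(d, K, s0, T):
    """Leading-order linear-utility term (d/K + s0) * log T.

    Assumption: constants and extra log factors hidden by O-tilde are dropped.
    """
    return (d / K + s0) * math.log(T)


def rkhs_rate(K, T, alpha, beta, H):
    """Leading-order RKHS terms from the abstract, constants and logs dropped."""
    ab = 2 * alpha * beta
    # Term driven by the K auxiliary markets: shrinks as K grows.
    source_term = K ** (-ab / (ab + 1)) * T ** (1 / (ab + 1))
    # Term driven by the task-similarity parameter H: vanishes when H = 0.
    shift_term = H ** (2 / (2 * alpha + 1)) * T ** (1 / (2 * alpha + 1))
    return source_term + shift_term


if __name__ == "__main__":
    # Illustrative, made-up parameter values; not taken from the paper.
    for K in (1, 5, 10):
        print(K, linear_rate(d=100, K=K, s0=5, T=10_000),
              rkhs_rate(K=K, T=10_000, alpha=1.0, beta=1.0, H=0.1))
```

In this sketch the $d/K$ term shrinks as auxiliary markets are added, while the $s_{0}$ and $H$ terms do not depend on $K$; they represent the part of the rate that transfer across markets does not reduce.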