Motivated by robust dynamic resource allocation in operations research, we study the Online Learning to Transport (OLT) problem where the decision variable is a probability measure, an infinite-dimensional object. We draw connections between online learning, optimal transport, and partial differential equations through an insight called the minimal selection principle, originally studied in the Wasserstein gradient flow setting by Ambrosio et al. (2005). This allows us to extend the standard online learning framework to the infinite-dimensional setting seamlessly. Based on our framework, we derive a novel method called the minimal selection or exploration (MSoE) algorithm to solve OLT problems using mean-field approximation and discretization techniques. In the displacement convex setting, the main theoretical message underpinning our approach is that minimizing transport cost over time (via the minimal selection principle) ensures optimal cumulative regret upper bounds. On the algorithmic side, our MSoE algorithm applies beyond the displacement convex setting, making the mathematical theory of optimal transport practically relevant to non-convex settings common in dynamic resource allocation.
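To make the setting concrete, the following is a minimal illustrative sketch of an OLT-style loop on a discretized probability measure. The grid, the quadratic per-round losses, the step size, and the entropic mirror-descent update are all assumptions made for exposition; this is not the paper's MSoE algorithm, only a toy stand-in showing how a learner that moves its measure a small amount per round accumulates regret.

```python
import numpy as np

# Illustrative sketch only (not the MSoE algorithm from the paper):
# the learner maintains a probability measure over a fixed grid and,
# each round, pays the expected loss of its current measure while
# making a small, transport-cheap update toward lower-loss regions.

rng = np.random.default_rng(0)

n = 50                                   # grid points discretizing the support
grid = np.linspace(0.0, 1.0, n)          # locations on [0, 1] (assumed support)
mu = np.full(n, 1.0 / n)                 # current measure: uniform start

T = 200                                  # number of online rounds
eta = 0.5                                # step size (assumed, for illustration)
cumulative_regret = 0.0

for t in range(T):
    # Environment reveals a per-location loss; here a quadratic loss around
    # a slowly drifting target, purely for illustration.
    target = 0.5 + 0.3 * np.sin(2 * np.pi * t / T)
    loss = (grid - target) ** 2

    # Learner suffers the expected loss of its current measure; regret is
    # measured against the best single location in hindsight per round.
    cumulative_regret += mu @ loss - loss.min()

    # Entropic mirror-descent step: a cheap proxy for moving the measure only
    # slightly each round, echoing the "minimal transport cost" intuition.
    mu = mu * np.exp(-eta * loss)
    mu /= mu.sum()

print(f"average regret after {T} rounds: {cumulative_regret / T:.4f}")
```

The design choice here is deliberate: updating mass multiplicatively and renormalizing keeps the iterate a valid probability measure at every round, which is the discretized analogue of the infinite-dimensional constraint in the OLT problem.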