This paper introduces a dual-based algorithmic framework for regularized online resource allocation problems with cumulative convex rewards, hard resource constraints, and a non-separable regularizer. Under a strategy of adaptively updating the resource constraints, the proposed framework requires only an approximate solution to the empirical dual problem up to a certain accuracy, yet delivers an optimal logarithmic regret under a local strong convexity assumption. Surprisingly, a delicate analysis of the dual objective function enables us to eliminate the notorious log log factor in the regret bound. The flexibility of the framework makes renowned and computationally fast algorithms immediately applicable, e.g., dual gradient descent and stochastic gradient descent. A worst-case square-root regret lower bound is established when the resource constraints are not adaptively updated during dual optimization, which underscores the critical role of adaptive dual variable updates. Comprehensive numerical experiments and a real-data application demonstrate the merits of the proposed algorithmic framework.
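To make the mechanism concrete, the following is a minimal sketch (not the paper's exact algorithm) of dual gradient descent for online allocation with adaptively updated resource constraints: at each period the remaining budget is re-averaged over the remaining horizon, the primal decision is priced through the current dual variable, and the dual variable takes a projected gradient step. The linear-reward setting, the function name, and the step size `eta` are illustrative assumptions.

```python
import numpy as np

def dual_online_allocation(rewards, costs, budget, eta=0.05):
    """Illustrative sketch: dual (sub)gradient descent for online resource
    allocation with an adaptively updated per-period budget target.

    rewards: (T,) per-request rewards (linear-reward simplification);
    costs:   (T, m) resource consumption per request;
    budget:  (m,) total resource budgets.
    Returns the total collected reward.
    """
    T, m = costs.shape
    mu = np.zeros(m)                     # dual prices on resources
    remaining = budget.astype(float)     # resources left
    total = 0.0
    for t in range(T):
        # Adaptive constraint update: average resource available
        # per remaining period (the key adaptive ingredient).
        rho = remaining / (T - t)
        # Primal step: accept iff reward exceeds the priced resource cost.
        accept = rewards[t] > mu @ costs[t]
        if accept and np.all(costs[t] <= remaining):
            total += rewards[t]
            remaining -= costs[t]
            consumed = costs[t]
        else:
            consumed = np.zeros(m)
        # Dual step: projected gradient update toward matching
        # consumption with the adaptive per-period target.
        mu = np.maximum(0.0, mu + eta * (consumed - rho))
    return total
```

Because the dual update only needs a (sub)gradient of the empirical dual objective, this template accommodates both full gradient descent and stochastic gradient variants, as claimed in the abstract.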