This paper considers constrained online dispatching with unknown arrival, reward and constraint distributions. We propose a novel online dispatching algorithm, named POND, standing for Pessimistic-Optimistic oNline Dispatching, which achieves $O(\sqrt{T})$ regret and $O(1)$ constraint violation. Both bounds are sharp. Our experiments on synthetic and real datasets show that POND achieves low regret with minimal constraint violations.