This paper studies an online optimal resource reservation problem in communication networks with job transfers where the goal is to minimize the reservation cost while maintaining the blocking cost under a certain budget limit. To tackle this problem, we propose a novel algorithm based on a randomized exponentially weighted method that encompasses long-term constraints. We then analyze the performance of our algorithm by establishing an upper bound for the associated regret and the cumulative constraint violations. Finally, we present numerical experiments where we compare the performance of our algorithm with those of reinforcement learning where we show that our algorithm surpasses it.
翻译:暂无翻译