Task scheduling is a critical problem when one user offloads multiple different tasks to the edge server. When a user has multiple tasks to offload, only one task can be transmitted to the server at a time, and the server processes tasks in the order of transmission, the problem is NP-hard. It is difficult for traditional optimization methods to quickly obtain the optimal solution, while approaches based on reinforcement learning (RL) face the challenge of an excessively large action space and slow convergence. In this paper, we propose a Digital Twin (DT)-assisted RL-based task scheduling method to improve the performance and convergence of RL. We use the DT to simulate the outcomes of different decisions made by the agent, so that one agent can try multiple actions at a time or, equivalently, multiple agents can interact with the environment in parallel in the DT. In this way, the exploration efficiency of RL is significantly improved, so RL converges faster and is less likely to get stuck in local optima. In particular, two algorithms are designed to make task scheduling decisions, i.e., DT-assisted asynchronous Q-learning (DTAQL) and DT-assisted exploring Q-learning (DTEQL). Simulation results show that both algorithms significantly improve the convergence speed of Q-learning by increasing exploration efficiency.
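To make the exploration idea concrete, the following is a minimal sketch (not the paper's actual DTEQL or DTAQL algorithm) of how a digital twin can let a Q-learning agent evaluate several candidate actions per decision step while committing only one to the real system. The function dt_model(state, action) is a hypothetical stand-in for the DT's simulation of the edge server; the state encoding, reward, and learning rates are assumptions for illustration only.

import random
from collections import defaultdict

def dt_assisted_q_step(Q, state, actions, dt_model, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Evaluate every candidate action in the digital twin, update Q for each,
    then return the single action to execute in the real environment."""
    for a in actions:
        # Simulated outcome from the DT: no real transmission or processing happens here.
        next_state, reward = dt_model(state, a)
        best_next = max(Q[next_state].values(), default=0.0)
        Q[state][a] += alpha * (reward + gamma * best_next - Q[state][a])

    # Epsilon-greedy choice of the action actually committed to the real system.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[state][a])

# Q-table indexed as Q[state][action], with unseen entries defaulting to 0.
Q = defaultdict(lambda: defaultdict(float))

Because the DT evaluates all candidate actions before one is committed, each real decision step yields multiple Q-value updates, which is the source of the faster convergence claimed above.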