In this paper, we consider a flying mobile edge computing (F-MEC) platform, in which unmanned aerial vehicles (UAVs) provide computation resources and enable task offloading from user equipment (UE). We aim to minimize the energy consumption of all the UEs by optimizing the user association, the resource allocation and the trajectories of the UAVs. To this end, we first propose a Convex optimizAtion based Trajectory control algorithm (CAT), which solves the problem iteratively using the block coordinate descent (BCD) method. Then, to make real-time decisions while taking into account the dynamics of the environment (i.e., the UAVs may take off from different locations), we propose a deep Reinforcement leArning based Trajectory control algorithm (RAT). In RAT, we apply Prioritized Experience Replay (PER) to improve the convergence of the training procedure. Unlike the convex optimization based algorithm, which may be sensitive to the initial points and requires iterations, RAT can adapt to any take-off points of the UAVs and obtains the solution more rapidly than CAT once the training process has been completed. Simulation results show that the proposed CAT and RAT achieve similar performance and both outperform traditional algorithms.
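To illustrate the PER mechanism mentioned above, the following is a minimal sketch of a proportional prioritized replay buffer of the kind commonly used to speed up deep reinforcement learning training. It is not the authors' implementation of RAT; the class name, the hyperparameters alpha and beta, and the use of TD-error magnitude as the priority signal are standard PER conventions assumed here for illustration.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (PER) buffer."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha                    # how strongly priorities bias sampling
        self.buffer = []                      # stored transitions
        self.priorities = np.zeros(capacity)  # one priority per slot
        self.pos = 0                          # next write position (circular)

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sample indices with probability proportional to priority^alpha.
        prios = self.priorities[:len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priorities track the magnitude of the TD errors of the sampled transitions.
        self.priorities[idx] = np.abs(td_errors) + eps
```

In a training loop, the agent would call `add` after each environment step, draw a weighted minibatch with `sample`, and feed the resulting TD errors back through `update_priorities`, so that transitions with larger errors are revisited more often.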