A Distributed Deep Reinforcement Learning Technique for Application Placement in Edge and Fog Computing Environments

Fog/edge computing is a novel computing paradigm that supports resource-constrained Internet of Things (IoT) devices by placing their tasks on edge and/or cloud servers. Recently, several Deep Reinforcement Learning (DRL)-based placement techniques have been proposed for fog/edge computing environments, but they are only suitable for centralized setups. Training well-performing DRL agents requires large amounts of diverse training data, and obtaining such data is costly. Hence, these centralized DRL-based techniques lack generalizability and quick adaptability, and thus fail to tackle application placement problems efficiently. Moreover, many IoT applications are modeled as Directed Acyclic Graphs (DAGs) with diverse topologies. Satisfying the dependencies of DAG-based IoT applications incurs additional constraints and increases the complexity of placement problems. To overcome these challenges, we propose an actor-critic-based distributed application placement technique built on the IMPortance weighted Actor-Learner Architecture (IMPALA). IMPALA is known for efficient distributed generation of experience trajectories, which significantly reduces the exploration costs of agents. In addition, it uses an adaptive off-policy correction method for faster convergence to optimal solutions. Our technique uses recurrent layers to capture the temporal behavior of input data and a replay buffer to improve sample efficiency. The performance results, obtained from simulation and testbed experiments, demonstrate that our technique significantly improves the execution cost of IoT applications by up to 30% compared to its counterparts.
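
To make the off-policy correction mentioned above more concrete, the following is a minimal sketch of the V-trace target computation defined in the original IMPALA paper. The abstract does not spell out the correction, so treating it as unmodified V-trace is an assumption. The function name `vtrace_targets`, its parameters, and the toy inputs are hypothetical and chosen only for illustration. The sketch handles a single trajectory in plain Python and is not the authors' implementation.

```python
import math


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory (illustrative sketch).

    behaviour_logp[t]: log-probability of the taken action under the actor's
                       (behaviour) policy at step t.
    target_logp[t]:    log-probability of the same action under the learner's
                       (target) policy at step t.
    values[t]:         learner's value estimate V(x_t).
    bootstrap_value:   value estimate for the state after the last step.
    """
    n = len(rewards)
    # Importance ratios pi(a_t | x_t) / mu(a_t | x_t).
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    # Truncated importance weights: rho_t for the TD error, c_t for the trace.
    rhos = [min(rho_bar, r) for r in ratios]
    cs = [min(c_bar, r) for r in ratios]

    next_values = list(values[1:]) + [bootstrap_value]
    # Importance-weighted temporal-difference errors.
    deltas = [rho * (r + gamma * nv - v)
              for rho, r, nv, v in zip(rhos, rewards, next_values, values)]

    # Backward recursion:
    #   v_t - V(x_t) = delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1}))
    targets = [0.0] * n
    acc = 0.0
    for t in reversed(range(n)):
        acc = deltas[t] + gamma * cs[t] * acc
        targets[t] = values[t] + acc
    return targets


# On-policy toy example: identical log-probabilities give ratios of 1,
# so the targets reduce to plain n-step returns.
example = vtrace_targets(
    behaviour_logp=[0.0, 0.0, 0.0],
    target_logp=[0.0, 0.0, 0.0],
    rewards=[1.0, 1.0, 1.0],
    values=[0.0, 0.0, 0.0],
    bootstrap_value=0.0,
    gamma=1.0,
)
print(example)  # [3.0, 2.0, 1.0]
```

When the behaviour and target policies agree, the importance ratios equal 1 and the targets reduce to ordinary n-step returns. When they diverge, the truncation levels `rho_bar` and `c_bar` bound how much any single step can influence the target, which is what lets distributed actors keep generating trajectories with slightly stale policies.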
