Distributed Training for Deep Learning Models on an Edge Computing Network Using Shielded Reinforcement Learning

Edge devices with local computation capability have made distributed deep learning (DL) training on edges possible. In one such method, the head of a cluster of edges schedules the DL training jobs submitted by the edges. With this centralized scheduling, the cluster head knows the load of every edge and can therefore avoid overloading the cluster's edges, but the head itself may become overloaded. To handle this problem, we propose a multi-agent reinforcement learning (MARL) system that enables each edge to schedule its own jobs using RL. However, without coordination among edges, action collisions may occur, in which multiple edges schedule tasks to the same edge and overload it. For this reason, we propose a system called Shielded ReinfOrcement learning (RL) based DL training on Edges (SROLE). In SROLE, a shield deployed in an edge checks for action collisions and provides alternative actions to avoid them. As a central shield for the entire cluster may become a bottleneck, we further propose a decentralized shielding method, in which different shields are responsible for different regions of the cluster and coordinate to avoid action collisions on the region boundaries. Our emulation and real-device experiments show that SROLE reduces training time by 59% compared to MARL and centralized RL.
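To make the idea of an action collision concrete, the following is a minimal sketch of what a central shield might do. It is not the SROLE algorithm itself. The load model (a single scalar demand per job and a fixed capacity per edge), the `Proposal` type, the `shield` function, and the "least-utilized edge with enough headroom" fallback rule are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    agent: str   # edge whose RL agent proposes the action
    target: str  # edge the agent wants to place the job on
    demand: int  # hypothetical scalar resource demand of the job


def shield(proposals, capacity, current_load):
    """Resolve action collisions among per-edge scheduling proposals.

    Proposals are examined in order. A proposal is kept if its target
    still has headroom; otherwise the shield substitutes an alternative
    target (assumed rule: the least-utilized edge that can fit the job),
    or None if no edge can take it.
    """
    load = dict(current_load)  # copy so the caller's view is untouched
    decisions = []
    for p in proposals:
        if load[p.target] + p.demand <= capacity[p.target]:
            chosen = p.target
        else:
            # Collision: earlier proposals already filled the target.
            candidates = [e for e in capacity
                          if load[e] + p.demand <= capacity[e]]
            chosen = min(candidates,
                         key=lambda e: load[e] / capacity[e],
                         default=None)
        if chosen is not None:
            load[chosen] += p.demand
        decisions.append((p.agent, chosen))
    return decisions, load
```

For example, if three agents all propose the same target edge and only the first job fits, this sketch keeps the first action and redirects the other two to edges that still have room. A decentralized variant, as described in the abstract, would run one such shield per region and additionally exchange information about proposals that target edges on a region boundary; that coordination step is not shown here.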
