The fifth generation (5G) of wireless networks is set to meet the stringent requirements of vehicular use cases. Edge computing resources can aid in this direction by moving processing closer to end-users, thereby reducing latency. However, given the stochastic nature of traffic loads and the availability of physical resources, appropriate auto-scaling mechanisms need to be employed to support cost-efficient and performant services. To this end, we employ Deep Reinforcement Learning (DRL) for vertical scaling in Edge computing to support vehicle-to-network (V2N) communications. We address the problem using Deep Deterministic Policy Gradient (DDPG). As DDPG is a model-free, off-policy algorithm for learning continuous actions, we introduce a discretization approach to support discrete scaling actions, thereby addressing the scalability problems inherent to high-dimensional discrete action spaces. Employing a real-world vehicular trace dataset, we show that DDPG outperforms existing solutions, reducing the average number of active CPUs by at least 23% while increasing the long-term reward by 24%.
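To make the discretization idea concrete, the following is a minimal sketch, not the authors' exact method, of one common way to map DDPG's bounded continuous actor output onto a finite set of discrete vertical-scaling actions (here, CPU allocations). The binning scheme, the bound of [-1, 1], and names such as N_CPU_LEVELS are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical number of discrete CPU allocations available to the scaler.
N_CPU_LEVELS = 16

def discretize_action(a_continuous: float) -> int:
    """Map a continuous DDPG action in [-1, 1] to a CPU count in {1, ..., N_CPU_LEVELS}."""
    a = np.clip(a_continuous, -1.0, 1.0)
    # Rescale to [0, 1], then bin into N_CPU_LEVELS equal-width intervals.
    idx = int(np.floor((a + 1.0) / 2.0 * N_CPU_LEVELS))
    return min(idx, N_CPU_LEVELS - 1) + 1  # clamp the a = 1.0 edge case; CPU counts start at 1

# Example: an actor output of 0.3 selects a mid-range CPU allocation.
print(discretize_action(0.3))  # -> 11
```

Under this kind of scheme, the actor still learns over a single continuous dimension, so the policy avoids enumerating a high-dimensional discrete action space while the environment only ever receives valid discrete scaling decisions.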