The complexity of emerging sixth-generation (6G) wireless networks has spurred the adoption of artificial intelligence (AI) to address the challenges of network management and resource allocation under strict service level agreements (SLAs). It inaugurates the era of massive network slicing as a distributive technology in which tenancy is extended to the final consumer through the pervasive digitalization of immersive vertical use cases. Despite the promising performance of deep reinforcement learning (DRL) in network slicing, the lack of transparency and interpretability of opaque models keeps users from trusting the decisions or predictions of DRL agents. This problem becomes even more pronounced when highly reliable and secure services must be provisioned. Leveraging eXplainable AI (XAI) in conjunction with an explanation-guided approach, we propose an eXplainable reinforcement learning (XRL) scheme to overcome the opaqueness of black-box DRL. The core concept behind the proposed method is the intrinsic interpretability of the reward hypothesis, which is used to encourage DRL agents to learn the best actions for specific network slice states while coping with the complex and conflict-prone relations among state-action pairs. To validate the proposed framework, we target a resource allocation optimization problem in which multi-agent XRL allocates the available radio resources so as to meet the SLA requirements of the slices. Finally, we present numerical results showing that the adopted XRL approach outperforms the DRL baseline. To the best of our knowledge, this is the first work to study the feasibility of an explanation-guided DRL approach in the context of 6G networks.
Title: Explanation-Guided Deep Reinforcement Learning for Trustworthy 6G RAN Slicing