Automating network processes so that they operate without human intervention is crucial in the complex Sixth Generation (6G) environment. 6G networks must therefore advance beyond basic automation, relying on Artificial Intelligence (AI) and Machine Learning (ML) for self-optimizing, autonomous operation. This requires zero-touch management and orchestration, the integration of Network Intelligence (NI) into the network architecture, and the efficient lifecycle management of intelligent functions. Despite its potential, integrating NI poses challenges in model development and application. To tackle these issues, this paper presents a novel methodology for managing the complete lifecycle of Reinforcement Learning (RL) applications in networking, thereby extending existing Machine Learning Operations (MLOps) frameworks to accommodate RL-specific tasks. We focus on scaling computing resources in service-based architectures, modeling the problem as a Markov Decision Process (MDP). Two RL algorithms, guided by distinct Reward Functions (RFns), are proposed to autonomously determine the number of service replicas in dynamic environments. Our methodology rests on a dual approach: first, it evaluates the training performance of these algorithms under varying RFns; second, it validates the trained agents to assess their practical applicability in real-world settings. We show that, despite significant progress, the development of RL techniques for networking applications, particularly in scaling scenarios, still leaves considerable room for improvement. This study underscores the importance of continued research and development to enhance the practicality and resilience of RL techniques in real-world networking environments.
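The abstract does not specify the concrete MDP or the two algorithms, so the following Python sketch is purely illustrative: the state (replica count and discretized load), the scale-in/hold/scale-out action set, the two reward functions (`reward_sla`, `reward_cost_aware`), and the tabular Q-learning learner are all hypothetical stand-ins for the paper's unnamed components, meant only to show how a replica-scaling MDP with distinct RFns could be set up.

```python
import random
from collections import defaultdict

MAX_REPLICAS = 10
ACTIONS = (-1, 0, 1)  # scale in, hold, scale out (assumed action set)

class ScalingEnv:
    """Toy MDP: state = (replica count, discretized load); actions adjust replicas."""
    def __init__(self):
        self.replicas, self.load = 1, 0.5  # load is a normalized value in [0, 1]

    def state(self):
        return (self.replicas, round(self.load, 1))

    def step(self, action):
        self.replicas = min(max(self.replicas + action, 1), MAX_REPLICAS)
        # Random-walk load dynamics stand in for real traffic traces.
        self.load = min(max(self.load + random.uniform(-0.15, 0.15), 0.0), 1.0)
        # Utilization > 1 means the current replica count cannot serve the load.
        utilization = self.load * MAX_REPLICAS / self.replicas
        return self.state(), utilization

def reward_sla(utilization, replicas):
    """Hypothetical RFn 1: punish overload only (tolerates over-provisioning)."""
    return -10.0 if utilization > 1.0 else 1.0

def reward_cost_aware(utilization, replicas):
    """Hypothetical RFn 2: also charge for held resources (favors tight scaling)."""
    penalty = -10.0 if utilization > 1.0 else 0.0
    return penalty + 1.0 - replicas / MAX_REPLICAS

def train(reward_fn, episodes=200, steps=100, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration; returns the Q-table."""
    q = defaultdict(float)
    for _ in range(episodes):
        env = ScalingEnv()
        s = env.state()
        for _ in range(steps):
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda x: q[(s, x)]))
            s_next, util = env.step(a)
            r = reward_fn(util, env.replicas)
            best_next = max(q[(s_next, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
    return q

if __name__ == "__main__":
    for name, rfn in [("sla", reward_sla), ("cost_aware", reward_cost_aware)]:
        q = train(rfn)
        print(f"{name}: learned {len(q)} state-action values")
```

The two hypothetical RFns illustrate the trade-off that the dual evaluation described above is meant to expose: a purely SLA-driven reward tends to over-provision replicas, while a cost-aware reward accepts occasional overload to keep resource usage low.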