As cellular networks evolve towards the 6th Generation (6G), Machine Learning (ML) is seen as a key enabling technology for improving the capabilities of the network. ML provides a methodology for building predictive systems, which in turn allow networks to act proactively. This proactive behavior can be leveraged to sustain, for example, a specific Quality of Service (QoS) requirement. With predictive Quality of Service (pQoS), a wide variety of new use cases, both safety- and entertainment-related, are emerging, especially in the automotive sector. In this work, we therefore consider maximum throughput prediction, which can enhance, for example, streaming or high-definition (HD) mapping applications. We discuss the entire ML workflow, highlighting often-overlooked aspects such as detailed sampling procedures, in-depth analysis of dataset characteristics, the effect of data splits on the reported results, and data availability. Reliable ML models face many challenges during their lifecycle. We highlight how confidence in ML technologies can be built by better understanding the underlying characteristics of the collected data. We discuss feature engineering and the effects of different splits on the training process, showing that random splits might overestimate performance by more than twofold. Moreover, we investigate diverse sets of input features, among which network information proved most effective, cutting the error by half. Part of our contribution is the validation of multiple ML models in diverse scenarios. We also use Explainable AI (XAI) to show that ML can learn underlying principles of wireless networks without being explicitly programmed. Our data was collected from a deployed network that was under full control of the measurement team and covered different vehicular scenarios and radio environments.
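The point about random splits can be illustrated with a minimal sketch. It is not the procedure used in this work: the data is synthetic, the field name `drive_id` and all function names are hypothetical, and holding out whole drives is only one assumed alternative to a random split. The sketch shows that a sample-level random split places measurements from the same drive in both the training and the test set, which is one plausible way a test score can look better than it would on unseen drives.

```python
import random
from collections import defaultdict


def random_split(samples, test_frac=0.2, seed=0):
    """Shuffle individual samples and cut off a test fraction."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_frac))
    return shuffled[n_test:], shuffled[:n_test]


def group_split(samples, test_frac=0.2, seed=0):
    """Hold out whole drives so that no drive appears in both sets."""
    by_drive = defaultdict(list)
    for s in samples:
        by_drive[s["drive_id"]].append(s)
    drive_ids = sorted(by_drive)
    rng = random.Random(seed)
    rng.shuffle(drive_ids)
    n_test = max(1, int(round(len(drive_ids) * test_frac)))
    test_ids = set(drive_ids[:n_test])
    train = [s for s in samples if s["drive_id"] not in test_ids]
    test = [s for s in samples if s["drive_id"] in test_ids]
    return train, test


def shared_drives(train, test):
    """Return the drive ids that occur in both the training and test set."""
    return {s["drive_id"] for s in train} & {s["drive_id"] for s in test}


# Synthetic stand-in for a measurement campaign (an assumption, not real data):
# 10 drives with 50 consecutive samples each.
samples = [{"drive_id": d, "t": t} for d in range(10) for t in range(50)]

train_r, test_r = random_split(samples)
train_g, test_g = group_split(samples)

print("drives shared by train/test, random split:", len(shared_drives(train_r, test_r)))
print("drives shared by train/test, group split: ", len(shared_drives(train_g, test_g)))
```

With the group-wise split, the shared set is empty by construction, so the test error reflects drives the model has never seen. Consecutive samples within a drive are typically strongly correlated, which is why the choice of split deserves explicit reporting.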