Network Traffic Classification (NTC) has become an important component of a wide variety of network management operations, e.g., Quality of Service (QoS) provisioning and security. Machine Learning (ML) algorithms, a common approach to NTC, can achieve reasonable accuracy and handle encrypted traffic. However, ML-based NTC techniques suffer from a shortage of labeled traffic data, which is the case in many real-world applications. This study investigates the applicability of Active Learning (AL), a form of ML that reduces the need for a large number of labeled examples by actively choosing the instances that should be labeled. The study first provides an overview of NTC and its fundamental challenges and surveys the literature on ML techniques in NTC. It then introduces the concepts of AL, discusses them in the context of NTC, and reviews the literature in this field. Further, challenges and open issues in the use of AL for NTC are discussed. Additionally, as a technical survey, it conducts experiments to show the broad applicability of AL in NTC. The simulation results show that AL can achieve high accuracy with a small amount of labeled data.
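To make the idea of "actively choosing the instances that should be labeled" concrete, the following is a minimal sketch of pool-based AL with least-confidence sampling. It is an illustration under stated assumptions, not the method or experimental setup of this study: the query strategy, the nearest-centroid classifier, the synthetic two-cluster data standing in for flow features, and all function names (`make_pool`, `fit_centroids`, `least_confident`, `active_learning`) are hypothetical choices made for the example.

```python
import math
import random


def make_pool(n_per_class, seed=0):
    """Synthetic pool (assumption): two well-separated 2-D clusters, not real traffic."""
    rng = random.Random(seed)
    centers = {"class_a": (0.0, 0.0), "class_b": (30.0, 30.0)}
    pool, truth = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            pool.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            truth.append(label)
    return pool, truth


def fit_centroids(X, y):
    """Compute one mean vector (centroid) per class label."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict_proba(centroids, features):
    """Softmax over negative Euclidean distances to each class centroid."""
    labels = sorted(centroids)
    scores = [-math.dist(features, centroids[lab]) for lab in labels]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return labels, [e / total for e in exps]


def predict(centroids, features):
    """Return the most probable class label."""
    labels, probs = predict_proba(centroids, features)
    return labels[probs.index(max(probs))]


def least_confident(centroids, pool, k):
    """Indices of the k pool items whose top class probability is lowest."""
    confidence = [max(predict_proba(centroids, x)[1]) for x in pool]
    return sorted(range(len(pool)), key=lambda i: confidence[i])[:k]


def active_learning(pool, oracle, seed_indices, rounds, k):
    """Fit on the labeled set, query the k most uncertain items, repeat."""
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(rounds):
        model = fit_centroids([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        picks = least_confident(model, [pool[i] for i in unlabeled], k)
        for p in picks:
            # The oracle stands in for a human annotator supplying the true label.
            labeled[unlabeled[p]] = oracle(unlabeled[p])
    model = fit_centroids([pool[i] for i in labeled], list(labeled.values()))
    return model, labeled
```

A possible usage is `pool, truth = make_pool(50)` followed by `active_learning(pool, lambda i: truth[i], [0, 50], rounds=3, k=2)`, which labels only eight of the 100 synthetic points. Least-confidence sampling is only one of several possible query strategies, and the toy classifier was chosen to keep the sketch dependency-free.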