Network Traffic Classification (NTC) has become an important component of various network management operations, e.g., Quality of Service (QoS) provisioning and security services. Machine Learning (ML) algorithms, a popular approach to NTC, offer reasonable classification accuracy and can handle encrypted traffic. However, ML-based NTC techniques suffer from a shortage of labeled traffic data, which is common in real-world applications. This study investigates the applicability of an active form of ML, called Active Learning (AL), to NTC. AL reduces the need for a large number of labeled examples by actively choosing the instances that should be labeled. The study first provides an overview of NTC and its fundamental challenges and surveys the literature on ML-based NTC methods. Then, it introduces the concepts of AL, discusses them in the context of NTC, and reviews the literature in this field. Further, challenges and open issues in AL-based classification of network traffic are discussed. Moreover, as this is a technical survey, experiments are conducted to show the broad applicability of AL in NTC. The simulation results show that AL can achieve high accuracy with a small amount of labeled data.
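To make the idea of "actively choosing the instances that should be labeled" concrete, the following is a minimal sketch of a pool-based AL loop. It is not the method evaluated in the study: the margin-based uncertainty criterion, the nearest-centroid classifier, the two-feature flow representation, and all function names (`centroids`, `margin`, `active_learn`, `oracle`) are assumptions chosen only for illustration.

```python
def centroids(labeled):
    """Mean feature vector per class, from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(cents, x):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda y: sq_dist(cents[y], x))


def margin(cents, x):
    """Gap between the two nearest centroids; a small gap means the
    classifier is uncertain about x. Needs at least two classes."""
    d = sorted(sq_dist(c, x) for c in cents.values())
    return d[1] - d[0]


def active_learn(seed, pool, oracle, budget):
    """Pool-based AL (illustrative): repeatedly ask the oracle to label
    the most uncertain unlabeled flow, then refit the classifier.

    seed   -- initial (features, label) pairs, at least one per class
    pool   -- unlabeled feature vectors
    oracle -- hypothetical labeling function (e.g., a human expert)
    budget -- maximum number of label queries
    """
    labeled = list(seed)
    pool = list(pool)
    for _ in range(min(budget, len(pool))):
        cents = centroids(labeled)
        idx = min(range(len(pool)), key=lambda i: margin(cents, pool[i]))
        x = pool.pop(idx)
        labeled.append((x, oracle(x)))  # one labeling cost per query
    return centroids(labeled), labeled
```

In this sketch each flow is a tuple of numeric features (for instance, mean packet size and mean inter-arrival time, both hypothetical choices), and the labeling budget bounds how many flows must be annotated. Other query strategies or classifiers could be substituted without changing the overall loop.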