With rising demand for wireless services and growing awareness of the need for data protection, existing network traffic analysis and management architectures face unprecedented challenges in classifying and synthesizing increasingly diverse services and applications. This paper proposes FS-GAN, a federated self-supervised learning framework that supports automatic traffic analysis and synthesis over a large number of heterogeneous datasets. FS-GAN is composed of multiple distributed Generative Adversarial Networks (GANs): each generator is designed to produce synthesized data samples that follow the distribution of an individual service's traffic, and each discriminator is trained to distinguish synthesized data samples from the real data samples of a local dataset. A federated learning-based framework coordinates the local model training processes of the different GANs across the datasets. FS-GAN can classify data of unknown service types and create synthetic samples that capture the traffic distribution of those types. We prove that FS-GAN can minimize the Jensen-Shannon Divergence (JSD) between the distribution of real data across all datasets and that of the synthesized data samples. FS-GAN also maximizes the JSD among the distributions of data samples created by different generators, so that each generator produces synthetic data samples that follow the same distribution as one particular service type. Extensive simulation results show that FS-GAN improves classification accuracy by over 20% on average compared with state-of-the-art clustering-based traffic analysis algorithms. FS-GAN can also synthesize highly complex mixtures of traffic types without requiring any human-labeled data samples.
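As an illustration of the quantity the analysis is built on, the following minimal sketch computes the standard Jensen-Shannon Divergence between two discrete distributions. It is not FS-GAN's training objective, which the abstract does not spell out; the function names and the three-bin histograms are hypothetical and chosen only for the example.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists."""
    # Terms with p_i == 0 contribute nothing, by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical traffic histograms over three bins (not data from the paper).
real = [0.7, 0.2, 0.1]
synthetic_close = [0.6, 0.3, 0.1]
synthetic_far = [0.1, 0.2, 0.7]

print(js_divergence(real, synthetic_close))
print(js_divergence(real, synthetic_far))
```

The JSD is symmetric, equals zero when the two distributions coincide, and is bounded above by ln 2 (in nats) when their supports are disjoint. Under this reading, a generator whose samples match the real traffic drives the first kind of JSD toward zero, while generators assigned to different service types are pushed toward the upper bound with respect to one another.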