In recent years, the rapid rise of video applications has led to an explosion of Internet video traffic, posing severe challenges to network management. Effectively identifying and managing video traffic has therefore become an urgent problem. However, existing video traffic feature extraction methods mainly target traditional packet-level and flow-level features, and their identification accuracy is low. In addition, video traffic identification often suffers from high data dimensionality, which calls for an effective way to select the features most relevant to the identification task. Although numerous studies have used feature selection to improve identification performance, none has focused on measuring feature distributions that do not overlap or overlap only slightly. First, this study proposes extracting video-related features to construct a large-scale feature set for identifying video traffic. Second, to reduce the cost of video traffic identification and select an effective feature subset, this study proposes an adaptive distribution distance-based feature selection (ADDFS) method, which uses the Wasserstein distance to measure the distance between feature distributions. To test the effectiveness of the proposal, we collected video traffic from different platforms in a campus network environment and conducted a set of experiments on these datasets. Experimental results suggest that the proposed method achieves high identification performance for both video scene traffic and cloud game video traffic. Lastly, a comparison of ADDFS with other feature selection methods shows that ADDFS is a practical feature selection technique not only for video traffic identification but also for general classification tasks.
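To make the role of the Wasserstein distance concrete, the following is a minimal sketch of distance-based feature ranking. It is not the ADDFS algorithm itself, whose adaptive procedure the abstract does not describe. The function names `wasserstein_1d` and `rank_features`, the restriction to two classes, and the min-max scaling step are all assumptions made for illustration. The point it shows is that, unlike overlap-based measures, the Wasserstein distance keeps growing as two non-overlapping class-conditional distributions move further apart.

```python
from bisect import bisect_right


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two empirical 1-D samples.

    Computed as the integral of |CDF_u(x) - CDF_v(x)| over x.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def rank_features(rows, labels):
    """Return feature indices ordered from most to least class-separating.

    Hypothetical helper: handles exactly two classes and min-max scales
    each feature so that distances are comparable across features.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    first, second = classes
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # guard against constant features
        scaled = [(x - low) / span for x in column]
        u = [x for x, y in zip(scaled, labels) if y == first]
        v = [x for x, y in zip(scaled, labels) if y == second]
        scores.append((wasserstein_1d(u, v), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.0, 5.0], [1.0, 6.0], [10.0, 5.0], [11.0, 6.0]]
labels = [0, 0, 1, 1]
print(rank_features(rows, labels))  # prints [0, 1]
```

For one-dimensional samples, `scipy.stats.wasserstein_distance` computes the same quantity and could replace the hand-written helper where SciPy is available.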