Internet services have led to the eruption of network traffic, and machine learning on these Internet data has become an indispensable tool, especially when the application is risk-sensitive. This paper focuses on network traffic classification in the presence of class imbalance, which fundamentally and ubiquitously exists in Internet data analysis. This existence of class imbalance mostly drifts the optimal decision boundary and results in a less optimal solution. This brings severe safety concerns in the network traffic field when pattern recognition is challenging with numerous minority malicious classes. To alleviate these effects, we design a \textit{group \& reweight} strategy for alleviating the class imbalance. Inspired by the group distributionally optimization framework, our approach heuristically clusters classes into groups, iteratively updates the non-parametric weights for separate classes and optimizes the learning model by minimizing reweighted losses. We theoretically interpret the optimization process from a Stackelberg game and perform extensive experiments on typical benchmarks. Results show that our approach can not only suppress the negative effect of class imbalance but also improve the comprehensive performance in prediction.
翻译:暂无翻译