DeepCC: Bridging the Gap Between Congestion Control and Applications via Multi-Objective Optimization

Increasingly complex and diverse applications have distinct network performance demands, e.g., some need high throughput while others require low latency. Traditional congestion control (CC) algorithms have no awareness of these demands. Consequently, prior work has explored objective-specific algorithms, based on either offline training or online learning, that adapt to particular application demands. However, once generated, such an algorithm is tailored to a single performance objective function. New performance demands arising in a changing network environment then require either expensive retraining (in the case of offline training) or manual redesign of the objective function (in the case of online learning). To address this problem, we propose a novel architecture, DeepCC. It generates a CC agent that is generically applicable to a wide range of application requirements and network conditions. The key idea of DeepCC is to combine offline deep reinforcement learning with online fine-tuning. In the offline phase, instead of training towards a specific objective function, DeepCC trains its deep neural network model using multi-objective optimization. With the trained model, DeepCC offers near-Pareto-optimal policies with respect to different user-specified trade-offs between throughput, delay, and loss rate, without any redesign or retraining. In addition, a quick online fine-tuning phase further helps DeepCC meet application-specific demands under dynamic network conditions. Simulation and real-world experiments show that DeepCC outperforms state-of-the-art schemes in a wide range of settings. DeepCC achieves a target completion ratio of application requirements up to 67.4% higher than that of other schemes, even in an untrained environment.
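
To make the notion of Pareto-optimal trade-offs between throughput, delay, and loss rate concrete, the following is a minimal sketch, not DeepCC's actual method. The abstract does not give the objective formulation, so this example assumes a plain weighted-sum scalarization over normalized metrics, which is a common stand-in. The names `Outcome`, `dominates`, `pareto_front`, `scalarize`, and `pick`, as well as the sample numbers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Measured result of one hypothetical CC policy (illustrative values only)."""
    throughput: float  # Mbps, higher is better
    delay: float       # ms, lower is better
    loss: float        # fraction in [0, 1], lower is better


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is at least as good as b on every metric and strictly better on one."""
    no_worse = (a.throughput >= b.throughput
                and a.delay <= b.delay
                and a.loss <= b.loss)
    strictly_better = (a.throughput > b.throughput
                       or a.delay < b.delay
                       or a.loss < b.loss)
    return no_worse and strictly_better


def pareto_front(outcomes):
    """Keep only the outcomes that no other outcome dominates."""
    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]


def scalarize(o, weights, max_tput, max_delay):
    """Assumed weighted-sum objective: reward throughput, penalize delay and loss."""
    w_tput, w_delay, w_loss = weights
    return (w_tput * o.throughput / max_tput
            - w_delay * o.delay / max_delay
            - w_loss * o.loss)


def pick(outcomes, weights):
    """Choose the outcome that maximizes the scalarized objective for these weights."""
    max_tput = max(o.throughput for o in outcomes)
    max_delay = max(o.delay for o in outcomes)
    return max(outcomes, key=lambda o: scalarize(o, weights, max_tput, max_delay))


# Hypothetical policy outcomes; the last one is dominated by the second.
candidates = [
    Outcome(throughput=90.0, delay=80.0, loss=0.02),
    Outcome(throughput=60.0, delay=30.0, loss=0.01),
    Outcome(throughput=30.0, delay=12.0, loss=0.00),
    Outcome(throughput=50.0, delay=40.0, loss=0.02),
]

front = pareto_front(candidates)
bulk_transfer = pick(candidates, (0.8, 0.1, 0.1))  # throughput-oriented application
interactive = pick(candidates, (0.1, 0.8, 0.1))    # latency-oriented application
balanced = pick(candidates, (0.5, 0.4, 0.1))       # mixed requirement
```

In this sketch, changing only the weight vector moves the selection along the Pareto front: the throughput-oriented weights select the first outcome, the latency-oriented weights select the third, and the mixed weights select the second. That is the kind of user-specified trade-off the abstract describes, although how DeepCC itself encodes application preferences is not stated in the text.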
