Many congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate congestion control actions to a machine cannot converge properly within the stringent time scales of data centers. We leverage multi-agent reinforcement learning to design a system that dynamically tunes congestion control parameters at end-hosts in a data center. The system comprises agents at the end-hosts that monitor and report the network and traffic states, and agents that run the reinforcement learning algorithm on those states. Based on the state of the environment, the system generates congestion control parameters that optimize network performance metrics such as throughput and latency. As a case study, we examine BBR, a prominent, recently developed congestion control protocol. Our experiments demonstrate that the proposed system has the potential to mitigate the problems of static parameters.
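To make the monitor-then-tune loop concrete, the following is a minimal sketch, not the system described above. It replaces multi-agent reinforcement learning with a much simpler epsilon-greedy value update (a contextual bandit) over a single tunable parameter. Every name and number in it (`HostState`, `TuningAgent`, `CANDIDATE_GAINS`, `toy_environment`, the latency threshold, and the reward weights) is a hypothetical assumption chosen for illustration, and the toy environment is invented rather than measured.

```python
import random
from dataclasses import dataclass, field

# Hypothetical candidate values for one tunable parameter (assumption, not from the paper).
CANDIDATE_GAINS = (1.0, 1.25, 1.5)


@dataclass(frozen=True)
class HostState:
    """State that a hypothetical monitoring agent on an end-host would report."""
    throughput_mbps: float
    latency_ms: float


def discretize(state, latency_threshold_ms=1.0):
    """Map a raw state to a coarse label (the threshold is an assumption)."""
    return "congested" if state.latency_ms > latency_threshold_ms else "uncongested"


def reward(state, latency_weight=200.0):
    """Reward higher throughput and penalize latency (the weight is an assumption)."""
    return state.throughput_mbps - latency_weight * state.latency_ms


def toy_environment(gain):
    """Made-up response: a larger gain raises throughput but also latency."""
    return HostState(throughput_mbps=800.0 * gain, latency_ms=0.5 * gain ** 4)


@dataclass
class TuningAgent:
    """Epsilon-greedy learner that maps a state label to a parameter value."""
    epsilon: float = 0.1
    learning_rate: float = 0.5
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    q: dict = field(default_factory=dict)

    def best(self, state_key):
        # Greedy choice: the candidate with the highest learned value.
        return max(CANDIDATE_GAINS, key=lambda g: self.q.get((state_key, g), 0.0))

    def choose(self, state_key):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(CANDIDATE_GAINS)
        return self.best(state_key)

    def update(self, state_key, gain, observed_reward):
        # Move the stored value toward the observed reward.
        old = self.q.get((state_key, gain), 0.0)
        self.q[(state_key, gain)] = old + self.learning_rate * (observed_reward - old)


def run(steps=5000, seed=0):
    """Alternate between reporting state, choosing a parameter, and learning."""
    agent = TuningAgent(rng=random.Random(seed))
    state = toy_environment(CANDIDATE_GAINS[0])
    for _ in range(steps):
        key = discretize(state)
        gain = agent.choose(key)
        state = toy_environment(gain)
        agent.update(key, gain, reward(state))
    return agent
```

Calling `run()` returns a trained agent, and `agent.best("congested")` gives the parameter value it currently prefers for that state label. In a real deployment the toy environment would be replaced by measurements from the end-hosts, and the chosen value would be pushed to the congestion control implementation through whatever interface it exposes.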