Distributed Volt/Var control (VVC) methods have been widely studied for active distribution networks (ADNs), but they typically rely on an accurate network model and real-time peer-to-peer (P2P) communication. In practice, the model is often incomplete with significant parameter errors, and such a P2P communication system is difficult to maintain. In this paper, we propose an online multi-agent reinforcement learning and decentralized control framework (OLDC) for VVC. In this framework, the VVC problem is formulated as a constrained Markov game, and a novel multi-agent constrained soft actor-critic (MACSAC) reinforcement learning algorithm is proposed. MACSAC trains the control agents online, so an accurate ADN model is no longer needed. The trained agents then realize decentralized optimal control using only local measurements, without real-time P2P communication. OLDC with MACSAC shows remarkable flexibility, efficiency, and robustness under various computing and communication conditions. Numerical simulations on IEEE test cases not only demonstrate that the proposed MACSAC outperforms state-of-the-art learning algorithms, but also confirm the superiority of the OLDC framework in online applications.
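The abstract describes a decentralized execution stage in which each trained agent acts only on its own local measurements, with no real-time P2P communication. The following minimal sketch illustrates that execution pattern only; the LocalVarAgent class, the linear policy weights, and the observation layout are illustrative assumptions and do not reproduce the paper's MACSAC actor network.

```python
import numpy as np

# Minimal sketch (not the paper's implementation): decentralized execution
# after online training. Each agent maps only its LOCAL measurements
# (e.g., bus voltage magnitude, local P and Q injections) to a reactive
# power setpoint, so no real-time P2P communication is needed at control time.

class LocalVarAgent:
    def __init__(self, obs_dim, q_limit, rng):
        # Hypothetical "trained" parameters; in the paper these would come
        # from the MACSAC actor trained online against the real feeder.
        self.W = rng.normal(scale=0.1, size=obs_dim)
        self.q_limit = q_limit  # inverter reactive power capability (p.u.)

    def act(self, local_obs):
        # Deterministic action at execution time, squashed to the Var limits,
        # analogous to using the mean of a soft actor-critic policy.
        return self.q_limit * np.tanh(self.W @ local_obs)

rng = np.random.default_rng(0)
agents = [LocalVarAgent(obs_dim=3, q_limit=0.3, rng=rng) for _ in range(4)]

# Each agent only observes its own bus: [voltage magnitude, local P, local Q].
local_measurements = rng.uniform([0.95, 0.0, -0.1], [1.05, 0.5, 0.1], size=(4, 3))
q_setpoints = [agent.act(obs) for agent, obs in zip(agents, local_measurements)]
print(q_setpoints)
```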