To integrate large amounts of renewable energy resources, electrical power grids must be able to cope with high-amplitude, fast-timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast-timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: one hand-engineered, the other learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses with constant processing time.