In this paper, multi-agent reinforcement learning is used to control a hybrid energy storage system, with the agents working collaboratively to reduce the energy costs of a microgrid by maximising the value of renewable energy and trading. The agents must learn to control three different types of energy storage system, suited to short-, medium-, and long-term storage, under fluctuating demand, dynamic wholesale energy prices, and unpredictable renewable energy generation. Two case studies are considered: the first looks at how the energy storage systems can better integrate renewable energy generation under dynamic pricing, and the second at how those same agents can be used alongside an aggregator agent to sell energy to self-interested external microgrids looking to reduce their own energy bills. This work found that the centralised learning with decentralised execution of the multi-agent deep deterministic policy gradient and its state-of-the-art variants allowed the multi-agent methods to perform significantly better than control by a single global agent. It was also found that using separate reward functions in the multi-agent approach performed much better than using a single control agent. Being able to trade with the other microgrids, rather than just selling back to the utility grid, was also found to greatly increase the microgrid's savings.