With the growing expansion of renewable energy, intraday electricity markets have become increasingly popular among traders and electric utilities as a way to cope with the resulting volatility of the energy supply. Owing to their short trading horizon and continuous nature, intraday markets make it possible to adjust trading decisions made on the day-ahead market or to reduce trading risk at short notice. Producers of renewable energy use the intraday market to lower their forecast risk by modifying the capacities they offer based on current forecasts. However, the market dynamics are complex because power grids have to remain stable and electricity is only partly storable. Consequently, robust and intelligent trading strategies are required that are capable of operating in the intraday market. In this work, we propose a novel autonomous trading approach based on Deep Reinforcement Learning (DRL) algorithms as a possible solution. For this purpose, we model intraday trading as a Markov Decision Problem (MDP) and employ the Proximal Policy Optimization (PPO) algorithm as our DRL approach. We introduce a simulation framework that enables trading on the continuous intraday price at a resolution of one-minute steps. We test our framework in a case study from the perspective of a wind park operator. In addition to general trade information, we include both price and wind forecasts. On a test scenario of German intraday trading results from 2018, we outperform multiple baselines with an improvement of at least 45.24%, showing the advantage of the DRL algorithm. We also discuss limitations and possible enhancements of the DRL agent in order to increase performance in future work.
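For readers unfamiliar with PPO, the following minimal sketch shows the standard clipped surrogate objective that the algorithm maximizes. It is an illustration only, not the implementation used in this work: the function name `ppo_clipped_objective` is hypothetical, and the clipping range `eps=0.2` is an assumed, commonly used default that the text above does not specify.

```python
def ppo_clipped_objective(ratios, advantages, eps=0.2):
    """Batch mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping range (assumed value, not taken from the text)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + eps`. With a negative advantage, it stops rewarding reductions once the ratio falls below `1 - eps`. This is what limits the size of each policy update.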