The rapid adoption of electric vehicles (EVs) calls for the widespread installation of EV charging stations. To maximize the profitability of charging stations, intelligent controllers that provide both charging and electric grid services are in high demand. However, it is challenging to determine the optimal charging schedule due to the uncertain arrival times and charging demands of EVs. In this paper, we propose a novel centralized allocation and decentralized execution (CADE) reinforcement learning (RL) framework to maximize the charging station's profit. In the centralized allocation process, EVs are allocated to either waiting or charging spots. In the decentralized execution process, each charger makes its own charging/discharging decision while learning the action-value function from a shared replay memory. This CADE framework significantly improves the scalability and sample efficiency of the RL algorithm. Numerical results show that the proposed CADE framework is both computationally efficient and scalable, and that it significantly outperforms the baseline model predictive control (MPC). We also provide an in-depth analysis of the learned action-value function to explain the inner workings of the reinforcement learning agent.
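The decentralized execution idea described above can be sketched in miniature. The snippet below is an illustrative toy, not the paper's implementation: tabular Q-learning stands in for the paper's deep RL, and the class names, state encoding, and hyperparameters are all assumptions introduced for illustration. What it shows is the structural point of CADE: every charger runs its own agent, but all agents push transitions into, and sample from, one shared replay memory, which is what improves sample efficiency.

```python
import random
from collections import deque, defaultdict


class SharedReplayMemory:
    """One replay buffer shared by all charger agents."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        # transition = (state, action, reward, next_state)
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


class ChargerAgent:
    """One decentralized agent per charging spot; learns an action-value
    function Q(s, a) over {discharge, idle, charge} from the shared memory."""

    ACTIONS = (-1, 0, 1)  # discharge / idle / charge

    def __init__(self, memory, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.memory = memory
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy local decision; no central coordinator needed here.
        if random.random() < self.epsilon:
            return random.choice(self.ACTIONS)
        return max(self.ACTIONS, key=lambda a: self.q[(state, a)])

    def learn(self, batch_size=32):
        # Q-learning update on transitions drawn from the SHARED memory,
        # including experience generated by other chargers.
        for s, a, r, s_next in self.memory.sample(batch_size):
            target = r + self.gamma * max(self.q[(s_next, b)] for b in self.ACTIONS)
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
```

In this sketch, every `ChargerAgent` would be constructed with the same `SharedReplayMemory` instance, so a transition observed at one spot updates the value estimates of all chargers; the centralized allocation step (assigning arriving EVs to waiting or charging spots) would sit in a separate controller above these agents.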