With recent advances in mobile energy storage technologies, electric vehicles (EVs) have become a crucial part of smart grids. When EVs participate in the demand response program, charging costs can be significantly reduced by taking full advantage of real-time pricing signals. However, many stochastic factors exist in the dynamic environment, which poses significant challenges for designing an optimal charging/discharging control strategy. This paper develops an optimal EV charging/discharging control strategy for different EV users under dynamic environments to maximize EV users' benefits. We first formulate this problem as a Markov decision process (MDP). We then model EV users with different behaviors as agents operating in different environments. Furthermore, a horizontal federated reinforcement learning (HFRL)-based method is proposed to accommodate various users' behaviors and dynamic environments. This approach learns an optimal charging/discharging control strategy without sharing users' profiles. Simulation results illustrate that the proposed real-time EV charging/discharging control strategy performs well under various stochastic factors.
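To illustrate the horizontal federated learning idea referred to above, the following minimal sketch (not the authors' implementation) shows several EV agents each updating a local Q-function on their own data, with a coordinator averaging only model parameters so that users' profiles never leave the client. The state/action dimensions, the linear Q-function, and all names are illustrative assumptions.

```python
# Hedged sketch of HFRL-style training: local RL updates + federated averaging.
# All dimensions, rewards, and names are assumptions for illustration only.
import numpy as np

STATE_DIM = 4   # e.g., price, state of charge, time slot, remaining demand (assumed)
N_ACTIONS = 3   # discharge, idle, charge (assumed discretization)

class LocalEVAgent:
    """One EV user: learns a linear Q-function from its own experience only."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)
        self.weights = np.zeros((N_ACTIONS, STATE_DIM))

    def local_update(self, steps=50, lr=0.01, gamma=0.95):
        # Toy TD(0) updates on synthetic transitions standing in for the user's
        # private charging history; real rewards would reflect real-time prices
        # and battery constraints.
        for _ in range(steps):
            s = self.rng.normal(size=STATE_DIM)
            a = int(self.rng.integers(N_ACTIONS))
            r = float(self.rng.normal())
            s_next = self.rng.normal(size=STATE_DIM)
            td_target = r + gamma * np.max(self.weights @ s_next)
            td_error = td_target - self.weights[a] @ s
            self.weights[a] += lr * td_error * s
        return self.weights

def federated_average(agents):
    """Coordinator step: average parameters across agents (FedAvg-style)."""
    global_weights = np.mean([ag.weights for ag in agents], axis=0)
    for ag in agents:
        ag.weights = global_weights.copy()  # broadcast the shared model
    return global_weights

agents = [LocalEVAgent(seed=i) for i in range(5)]  # 5 EV users with distinct behavior
for communication_round in range(10):
    for ag in agents:
        ag.local_update()         # training data stays on the user's side
    federated_average(agents)     # only model weights cross the network
```

The design point mirrored here is that only parameters are exchanged, so heterogeneous user behaviors can be pooled into one shared policy without exposing individual charging profiles.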