Electric vehicle usage has been increasing rapidly, but charging stations have not always kept up with demand, so routing vehicles to stations effectively is critical to operating at maximum efficiency. Deciding which stations to recommend to drivers is a complex problem with a multitude of possible recommendations, volatile usage patterns, and temporally extended consequences of recommendations. Reinforcement learning offers a powerful paradigm for solving sequential decision-making problems, but traditional methods may struggle with sample efficiency due to the high number of possible actions. By developing a model that allows complex representations of actions, we improve outcomes for users of our system by over 30% compared to existing baselines in simulation. If implemented widely, these better recommendations could save over 4 million person-hours of waiting and driving globally each year.