Optimizing prices for energy demand response requires a flexible controller with the ability to navigate complex environments. We propose a reinforcement learning controller with surprise-minimizing modifications to its architecture. We suggest that surprise minimization can improve learning speed by taking advantage of the predictability in people's energy usage. Our architecture performs well in a simulation of energy demand response. We propose this modification as a way to improve functionality and increase savings in a large-scale experiment.
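The abstract does not specify how surprise minimization enters the controller, so the following is only an illustrative sketch under stated assumptions. It assumes surprise is measured as the negative log-likelihood of observed energy usage under a running Gaussian model, and that this is subtracted from the task reward with a hypothetical weight. The names `RunningGaussian`, `shaped_reward`, and `weight` are invented for illustration and are not taken from the paper.

```python
import math


class RunningGaussian:
    """Running mean/variance of a scalar signal (Welford's algorithm).

    Assumption: observed energy usage is modeled as a single Gaussian.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self, floor=1e-6):
        # Fall back to unit variance until two samples have been seen.
        if self.n < 2:
            return 1.0
        return max(self.m2 / (self.n - 1), floor)

    def surprise(self, x):
        """Negative log-likelihood of x under the current Gaussian."""
        var = self.variance()
        return 0.5 * (math.log(2 * math.pi * var) + (x - self.mean) ** 2 / var)


def shaped_reward(task_reward, observed_usage, model, weight=0.1):
    """Task reward minus a weighted surprise penalty (hypothetical shaping).

    The model is updated after the penalty is computed, so each observation
    is scored against what was known before it arrived.
    """
    penalty = model.surprise(observed_usage)
    model.update(observed_usage)
    return task_reward - weight * penalty
```

Under these assumptions, usage close to what the model has already seen leaves the reward nearly unchanged, while unusual usage lowers it, which is one way predictability in people's energy usage could be turned into a learning signal.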