Reinforcement learning (RL) techniques have been developed to optimize industrial cooling systems, offering substantial energy savings compared to traditional heuristic policies. A major challenge in industrial control is learning behaviors that remain feasible in the real world given machinery constraints. For example, some actions can be executed only once every few hours, while others can be taken more frequently. Without extensive reward engineering and experimentation, an RL agent may fail to learn realistic operation of machinery. To address this, we use hierarchical reinforcement learning with multiple agents, each controlling a subset of actions according to its operating time scale. In a simulated HVAC control environment, our hierarchical approach achieves energy savings over existing baselines while satisfying constraints such as operating chillers within safe bounds.
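The idea of splitting control across agents by time scale can be illustrated with a minimal sketch. This is not the authors' implementation: the class `TimescaleAgent`, the function `run_hierarchy`, the action names `chillers_on` and `supply_temp_setpoint`, the `load` observation, and the hand-written placeholder policies are all hypothetical, and a real system would replace the placeholder policies with learned ones. The sketch only shows the scheduling: a slow agent acts once every few steps and holds its action in between, while a fast agent acts every step and observes the slow agent's held action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Observation = Dict[str, float]
Action = Dict[str, float]


@dataclass
class TimescaleAgent:
    """Hypothetical agent that controls a subset of actions at a fixed period."""
    name: str
    period: int  # the agent picks a new action once every `period` steps
    policy: Callable[[Observation], Action]
    last_action: Action = field(default_factory=dict)

    def maybe_act(self, step: int, obs: Observation) -> Action:
        # Choose a new action only on this agent's own schedule;
        # otherwise keep holding the previously chosen action.
        if step % self.period == 0:
            self.last_action = self.policy(obs)
        return self.last_action


def run_hierarchy(agents: List[TimescaleAgent],
                  observations: List[Observation]) -> List[Action]:
    """Return the joint action taken at each step."""
    # Slower agents go first so faster agents can see their held actions.
    ordered = sorted(agents, key=lambda a: a.period, reverse=True)
    joint_actions = []
    for step, obs in enumerate(observations):
        context = dict(obs)
        joint: Action = {}
        for agent in ordered:
            action = agent.maybe_act(step, context)
            context.update(action)
            joint.update(action)
        joint_actions.append(joint)
    return joint_actions


# Placeholder policies (assumptions for illustration, not learned policies).
def slow_policy(obs: Observation) -> Action:
    return {"chillers_on": 2.0 if obs["load"] > 0.5 else 1.0}


def fast_policy(obs: Observation) -> Action:
    return {"supply_temp_setpoint": 8.0 - 2.0 * obs["load"]}


agents = [
    TimescaleAgent("setpoint", period=1, policy=fast_policy),
    TimescaleAgent("chiller_staging", period=4, policy=slow_policy),
]
loads = [0.2, 0.9, 0.9, 0.9, 0.9, 0.1]
actions = run_hierarchy(agents, [{"load": x} for x in loads])
```

In this toy run the chiller count chosen at step 0 is held through step 3 even though the load rises at step 1, which mirrors the constraint that some machinery can only be switched every few hours, while the setpoint tracks the load at every step.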