Demand response is a widely used peak reduction method in smart grids: the distribution company sends signals to customers (agents), and one analyzes how the agents' usage patterns shift in response. Often, these signals take the form of incentives offered to agents. This work studies the effect of incentives on the probability that agents accept such offers in a real-world smart grid simulator, PowerTAC. We first show that there exists a function that gives the probability of an agent reducing its load as a function of the discount offered to it. We call it the reduction probability (RP). The RP function is further parameterized by the rate of reduction (RR), which can differ for each agent. We provide an optimal algorithm, MJS--ExpResponse, that outputs the discount for each agent by maximizing the expected reduction under a budget constraint. When the RRs are unknown, we propose a Multi-Armed Bandit (MAB) based online algorithm, MJSUCB--ExpResponse, to learn them. Experimentally, we show that it exhibits sublinear regret. Finally, we showcase the efficacy of the proposed algorithm in mitigating demand peaks in a real-world smart grid system using the PowerTAC simulator as a test bed.
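To make the budgeted allocation idea concrete, here is a minimal sketch. It is not the paper's MJS--ExpResponse algorithm. The abstract does not state the functional form of RP, so the exponential form `1 - exp(-rate * discount)` is an assumption (suggested only by the name "ExpResponse"), as are the function names, the integer discount units, and the simplification that each agent reduces one unit of load when it accepts. Under those assumptions RP is concave in the discount, so handing out the budget one unit at a time to the agent with the largest marginal gain maximizes the expected reduction.

```python
import math


def reduction_probability(discount, rate):
    """Assumed RP form (hypothetical): 1 - exp(-rate * discount)."""
    return 1.0 - math.exp(-rate * discount)


def expected_reduction(discounts, rates):
    """Expected number of agents that reduce load, assuming unit load per agent."""
    return sum(reduction_probability(c, r) for c, r in zip(discounts, rates))


def greedy_allocate(rates, budget_units):
    """Spend an integer budget one unit at a time on the best marginal gain."""
    discounts = [0] * len(rates)

    def marginal_gain(i):
        # Increase in agent i's RP if it receives one more unit of discount.
        return (reduction_probability(discounts[i] + 1, rates[i])
                - reduction_probability(discounts[i], rates[i]))

    for _ in range(budget_units):
        best_agent = max(range(len(rates)), key=marginal_gain)
        discounts[best_agent] += 1
    return discounts


if __name__ == "__main__":
    rates = [0.5, 0.1]  # hypothetical RR values for two agents
    allocation = greedy_allocate(rates, budget_units=4)
    print(allocation, expected_reduction(allocation, rates))
```

With known RRs this kind of allocation can be computed directly. When the RRs are unknown, the abstract's online variant would have to estimate them from observed accept/reject responses, which this sketch does not attempt.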