Distributed energy resources (DERs), such as rooftop solar panels, are growing rapidly and reshaping power systems. To promote DERs, utilities usually adopt feed-in tariffs (FITs), which pay DER owners fixed rates for supplying energy to the grid. An alternative to FITs is a market-based approach in which consumers and DER owners trade energy in an auction-based peer-to-peer (P2P) market, with rates determined by supply and demand. However, the complexity of auctions and the bounded rationality of market participants may invalidate many well-established theories of auction design and hinder market development. To address these challenges, we propose an automated bidding framework based on multi-agent, multi-armed bandit learning for repeated auctions, which aims to minimize each bidder's cumulative regret. Numerical results indicate that this multi-agent learning game converges to a steady state. Being particularly interested in auction design, we apply the framework to four different implementations of repeated double-sided auctions and compare their market outcomes. While it is difficult to pick a clear winner, the $k$-double auction (a variant of the uniform-price auction) and the McAfee auction (a variant of the Vickrey double auction) appear to perform well in general, each with its own strengths and weaknesses.
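To make the $k$-double auction mentioned above concrete, the following is a minimal sketch of one common textbook formulation for single-unit bids and asks. It is not taken from the paper: the function name `k_double_auction`, the single-unit assumption, and the convention that $k = 0$ selects the lowest clearing price and $k = 1$ the highest are all assumptions made for illustration, and the implementation evaluated in the paper may differ.

```python
from typing import List, Optional, Tuple


def k_double_auction(
    bids: List[float], asks: List[float], k: float = 0.5
) -> Tuple[int, Optional[float]]:
    """Clear a single-unit double auction at one uniform price.

    Hypothetical helper for illustration only. Returns (quantity, price),
    where price is None if no trade is possible.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")

    buy = sorted(bids, reverse=True)  # highest willingness to pay first
    sell = sorted(asks)               # cheapest offer first

    # Count the matched pairs: the largest q with buy[q-1] >= sell[q-1].
    q = 0
    while q < min(len(buy), len(sell)) and buy[q] >= sell[q]:
        q += 1
    if q == 0:
        return 0, None

    # Interval of prices at which exactly the q matched pairs want to trade.
    low = sell[q - 1]
    high = buy[q - 1]
    if q < len(buy):
        low = max(low, buy[q])     # first excluded buyer must not want in
    if q < len(sell):
        high = min(high, sell[q])  # first excluded seller must not want in

    # Assumed convention: k interpolates from the low end to the high end.
    price = low + k * (high - low)
    return q, price


if __name__ == "__main__":
    quantity, price = k_double_auction([10, 8, 4], [3, 6, 9], k=0.5)
    print(quantity, price)
```

In this sketch every matched buyer pays, and every matched seller receives, the same price, which is what makes the mechanism a uniform-price auction; the parameter `k` only decides how the surplus at the margin is split between the two sides.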