Payment channel networks (PCNs) are a layer-2 blockchain scalability solution. Their basic building block, the payment channel, lets pairs of nodes transact "off-chain," reducing the load on the layer-1 network. Nodes with multiple channels can serve as relays for multihop payments over a path of channels: they forward other nodes' payments using the liquidity of their own channels and withhold part of the amount as a fee. Over time, a relay node may end up with one or more unbalanced channels and then needs to trigger a rebalancing operation. In this paper, we study how a relay node can maximize its profits from fees by rebalancing with submarine swaps. We introduce a stochastic model that captures the dynamics of a relay node observing random transaction arrivals and performing occasional rebalancing operations, and we express the system evolution as a Markov Decision Process. We formulate the problem of maximizing the node's fortune over time across all rebalancing policies, and approximate the optimal solution with a Deep Reinforcement Learning (DRL)-based rebalancing policy. We build a discrete event simulator of the system and use it to compare policies and parameterizations, demonstrating the DRL policy's superior performance under most conditions. Overall, our approach aims to be the first to introduce DRL for network optimization in the complex world of PCNs.
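To make the relay and rebalancing dynamics concrete, the following is a minimal toy sketch, not the model or simulator of the paper. It assumes a relay node N with two equal-capacity channels (A-N and N-B), a proportional relay fee, and swaps modeled as a plain transfer between on-chain funds and a channel at a proportional cost. The fixed-threshold policy is a simple stand-in for illustration only, not the DRL policy, and all names and parameter values (`RelayNode`, `threshold_policy`, `simulate`, `fee_rate`, `swap_cost`, the thresholds) are hypothetical.

```python
import random


class RelayNode:
    """Toy relay node N with channels A-N and N-B (all values hypothetical)."""

    def __init__(self, cap=100.0, on_chain=50.0, fee_rate=0.01, swap_cost=0.005):
        self.cap = cap                             # capacity of each channel
        self.local = {"A": cap / 2, "B": cap / 2}  # N's balance in each channel
        self.on_chain = on_chain                   # N's on-chain funds
        self.fee_rate = fee_rate                   # assumed proportional relay fee
        self.swap_cost = swap_cost                 # assumed proportional swap cost

    def fortune(self):
        """Total funds of N, in channels and on-chain."""
        return self.local["A"] + self.local["B"] + self.on_chain

    def relay(self, src, dst, amount):
        """Forward a payment from neighbor src to neighbor dst, keeping a fee."""
        fee = self.fee_rate * amount
        if self.cap - self.local[src] < amount:  # sender's side is too low
            return False
        if self.local[dst] < amount - fee:       # N's outgoing side is too low
            return False
        self.local[src] += amount                # N receives the full amount
        self.local[dst] -= amount - fee          # N forwards the amount minus fee
        return True

    def swap_in(self, ch, amount):
        """Move on-chain funds into channel ch (simplified swap-in)."""
        room = (self.cap - self.local[ch]) / (1 - self.swap_cost)
        amount = max(0.0, min(amount, self.on_chain, room))
        self.on_chain -= amount
        self.local[ch] += amount * (1 - self.swap_cost)

    def swap_out(self, ch, amount):
        """Move funds from channel ch to on-chain (simplified swap-out)."""
        amount = max(0.0, min(amount, self.local[ch]))
        self.local[ch] -= amount
        self.on_chain += amount * (1 - self.swap_cost)


def threshold_policy(node, low=0.2, high=0.8):
    """Stand-in policy: pull each channel back to 50% when it drifts too far."""
    target = node.cap / 2
    for ch in ("A", "B"):
        if node.local[ch] < low * node.cap:
            node.swap_in(ch, (target - node.local[ch]) / (1 - node.swap_cost))
        elif node.local[ch] > high * node.cap:
            node.swap_out(ch, node.local[ch] - target)


def simulate(steps=1000, p_a_to_b=0.7, check_every=10, seed=0, rebalance=True):
    """Random arrivals with skewed direction; periodic rebalancing checks."""
    rng = random.Random(seed)
    node = RelayNode()
    served = 0
    for t in range(steps):
        src, dst = ("A", "B") if rng.random() < p_a_to_b else ("B", "A")
        served += node.relay(src, dst, rng.uniform(1.0, 10.0))
        if rebalance and (t + 1) % check_every == 0:
            threshold_policy(node)
    return node, served
```

Calling `simulate(rebalance=True)` and `simulate(rebalance=False)` with the same seed and comparing `node.fortune()` and `served` gives a rough feel for the trade-off between swap costs and the fees lost when a depleted channel forces the node to reject payments.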