This paper studies a fundamental question about the security of blockchain proof-of-work (PoW) consensus: how does the presence of multiple misbehaving miners affect the profitability of selfish mining? Each selfish miner (or attacker, used interchangeably) maintains a private chain and publishes it opportunistically to earn rewards disproportionate to his hash power. We first establish a general Markov chain model that characterizes the state transitions of the public and private chains under Basic Selfish Mining (BSM), and derive the stationary profitable threshold of hash power in closed form. In theory, the threshold drops from 25% for a single attacker to below 21.48% for two symmetric attackers; in experiments, it drops further to around 10% with eight symmetric attackers. We next explore the profitable threshold when one of the attackers mines strategically, modeled as a Partially Observable Markov Decision Process (POMDP) in which only half of the attributes of a mining state are observable to him. We present an online algorithm that computes a nearly optimal policy efficiently despite the large state space and the high-dimensional belief space. The strategic attacker mines selfishly, and more agilely than a BSM attacker, when his hash power is relatively high, and mines honestly otherwise, which leads to a much lower profitable threshold. Finally, we formulate a simple model of absolute mining revenue that yields an interesting observation: selfish mining is never profitable in the first difficulty adjustment period, and instead relies on being reimbursed by stationary selfish mining gains in later periods. The delay before an attacker becomes profitable grows as his hash power decreases, which makes blockchain miners more cautious about performing selfish mining.
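As a point of reference for the 25% single-attacker figure, the sketch below evaluates the well-known closed-form relative revenue of a single selfish miner from Eyal and Sirer's analysis and locates the profitable threshold by bisection. It is not the multi-attacker model of this paper. The parameter `gamma` (the fraction of honest hash power that mines on the attacker's block during a tie) and both function names are assumptions introduced here for illustration; the 25% threshold corresponds to `gamma = 0.5`.

```python
def relative_revenue(alpha: float, gamma: float) -> float:
    """Share of main-chain blocks won by a single selfish miner
    with hash power alpha (Eyal-Sirer closed form, assumed here)."""
    num = alpha * (1 - alpha) ** 2 * (4 * alpha + gamma * (1 - 2 * alpha)) - alpha ** 3
    den = 1 - alpha * (1 + (2 - alpha) * alpha)
    return num / den


def profitable_threshold(gamma: float, tol: float = 1e-9) -> float:
    """Smallest alpha in (0, 0.5) at which selfish mining beats
    honest mining, i.e. relative_revenue(alpha) > alpha."""
    lo, hi = 1e-6, 0.5 - 1e-6
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if relative_revenue(mid, gamma) > mid:
            hi = mid  # already profitable: threshold is at or below mid
        else:
            lo = mid  # not yet profitable: threshold is above mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for g in (0.0, 0.5):
        print(f"gamma={g}: threshold ~ {profitable_threshold(g):.4f}")
```

With `gamma = 0.5` the bisection converges to 0.25, matching the single-attacker baseline quoted above; with `gamma = 0` it converges to 1/3. The lower thresholds reported for two and eight symmetric attackers come from the paper's own model and are not reproduced by this formula.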