We study a well-known communication abstraction called Byzantine Reliable Broadcast (BRB). This abstraction is central to the design and implementation of fault-tolerant distributed systems, as many fault-tolerant distributed applications require communication with provable guarantees on message delivery. Our study focuses on fault-tolerant implementations for message-passing systems that are prone to process failures, such as crashes and malicious behavior. At PODC 1983, Bracha and Toueg (BT, for short) solved the BRB problem. BT has optimal resilience, since it tolerates t < n/3 Byzantine processes, where n is the number of processes. The present work aims to design a solution even more robust than BT by extending its fault model with self-stabilization, a strong notion of fault tolerance. In addition to tolerating Byzantine and communication failures, self-stabilizing systems can recover after the occurrence of arbitrary transient faults. These faults represent any violation of the assumptions under which the system was designed to operate (provided that the algorithm code remains intact). We propose, to the best of our knowledge, the first self-stabilizing Byzantine fault-tolerant (BFT) solution for repeated BRB in signature-free message-passing systems that follows BT's problem specification. Our contribution includes a self-stabilizing variation on BT that solves single-instance BRB for asynchronous systems. We also consider the problem of recycling instances of single-instance BRB. Our self-stabilizing BFT recycling for time-free systems facilitates the concurrent handling of a predefined number of BRB invocations and, in this way, can serve as the basis for self-stabilizing BFT consensus.
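As a small illustration of the resilience bound t < n/3 stated above, the following Python sketch computes the largest number of Byzantine processes that a system of n processes can tolerate under that bound. The function names `max_byzantine_faults` and `satisfies_resilience_bound` are hypothetical helpers introduced here for illustration only; they are not part of BT or of the proposed solution.

```python
def max_byzantine_faults(n: int) -> int:
    """Return the largest integer t with t < n/3 (equivalently 3t < n).

    Hypothetical helper: it only evaluates the bound from the abstract.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 3t < n  <=>  3t <= n - 1  <=>  t <= (n - 1) // 3 for integers.
    return (n - 1) // 3


def satisfies_resilience_bound(n: int, t: int) -> bool:
    """Check whether the pair (n, t) respects t < n/3 using integer arithmetic."""
    return t >= 0 and 3 * t < n


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, max_byzantine_faults(n))
```

For instance, under this bound four processes are needed to tolerate a single Byzantine process, and seven to tolerate two.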