Partial synchrony is a model of computation in many distributed algorithms and modern blockchains. These algorithms are typically parameterized in the number of participants, and their correctness requires the existence of bounds on message delays and on the relative speed of processes after reaching Global Stabilization Time. These characteristics make partially synchronous algorithms parameterized in the number of processes, and parametric in time bounds, which render automated verification of partially synchronous algorithms challenging. In this paper, we present a case study on formal verification of both safety and liveness of the Chandra and Toueg failure detector that is based on partial synchrony. To this end, we first introduce and formalize the class of symmetric point-to-point algorithms that contains the failure detector. Second, we show that these symmetric point-to-point algorithms have a cutoff, and the cutoff results hold in three models of computation: synchrony, asynchrony, and partial synchrony. As a result, one can verify them by model checking small instances, but the verification problem stays parametric in time. Next, we specify the failure detector and the partial synchrony assumptions in three frameworks: TLA+, IVy, and counter automata. Importantly, we tune our modeling to use the strength of each method: (1) We are using counters to encode message buffers with counter automata, (2) we are using first-order relations to encode message buffers in IVy, and (3) we are using both approaches in TLA+. By running the tools for TLA+ and counter automata, we demonstrate safety for fixed time bounds. By running IVy, we prove safety for arbitrary time bounds. Moreover, we show how to verify liveness of the failure detector by reducing the verification problem to safety verification. Thus, both properties are verified by developing inductive invariants with IVy.
翻译:部分同步是许多分布式算法和现代条块链中的一种计算模型。 这些计算法通常在参与者数量上进行参数化, 其正确性要求存在信息延迟和到达全球稳定时间后流程相对速度的界限。 这些特性使得进程数量中部分同步的算法和时间界限中的参数化部分同步, 使得部分同步算法的自动核查具有挑战性。 在本文中, 我们提出了一个关于对 Chandra 和 Toueg 故障检测器的安全和活性进行正式核查的案例研究, 其基础是部分同步。 为此, 我们首先引入并正式确定包含故障检测器的对称点对点算算法。 其次, 我们显示这些对点对点的参数参数参数化参数性, 使对部分同步算法进行自动核查, 使得对部分同步性算法进行自动核查。 由此, 我们通过模拟检查小程序来验证它们的安全性和活性方法, 但核查问题在时间里保持对准 。 下一步, 我们用失败检测和部分同步的自动测算法 显示我们使用自动测算法 。