An accountable distributed system provides means to detect deviations of system components from their expected behavior. It is natural to complement fault detection with a reconfiguration mechanism, so that the system could heal itself, by replacing malfunctioning parts with new ones. In this paper, we describe a framework that can be used to implement a large class of accountable and reconfigurable replicated services. We build atop the fundamental lattice agreement abstraction lying at the core of storage systems and cryptocurrencies. Our asynchronous implementation of accountable lattice agreement ensures that every violation of consistency is followed by an undeniable evidence of misbehavior of a faulty replica. The system can then be seamlessly reconfigured by evicting faulty replicas, adding new ones and merging inconsistent states. We believe that this paper opens a direction towards asynchronous "self-healing" systems that combine accountability and reconfiguration.
翻译:问责分布式系统为检测系统部件偏离预期行为提供了手段。 以重组机制补充故障检测是自然的,这样系统就能通过替换故障部件而自我治愈。 在本文中,我们描述了一个可用于实施大量问责和可重新配置的复制服务的框架。 我们建起一个基本拉蒂协议的抽象,它位于存储系统和加密的核心。 我们无懈可击地执行问责拉蒂协议可以确保每次违反一致性都会有不可否认的错误复制错误行为的证据。 然后,通过拆除错误复制品、添加新的复制品和合并不一致的状态,系统可以进行无缝的重组。 我们相信,该文件开启了向无节制的“自我保健”系统的方向,将问责和重组结合起来。