This paper proposes a new state transfer method for geographic state machine replication (SMR) that dynamically allocates the state to be transferred among replicas according to changes in communication bandwidth. SMR improves fault tolerance by replicating a service across multiple replicas. When a replica is newly added or recovers from a failure, the other replicas transfer the current state of the service to it. In geographic SMR, however, the communication bandwidths of the replicas differ and change constantly. As a result, existing state transfer methods cannot fully utilize the available bandwidth, and their state transfer time increases. To overcome this problem, our method divides the state into multiple chunks and assigns them to replicas based on each replica's bandwidth, so that a replica with higher bandwidth transfers more chunks. The proposed method also updates each replica's chunk assignment dynamically based on the currently estimated bandwidth. A performance evaluation on Amazon EC2 shows that the proposed method reduces the state transfer time by up to 47% compared to the existing method. In addition, we apply the proposed method to the dynamic replacement of replicas, which can mitigate latency degradation caused by network problems, and evaluate how quickly the method can relocate a replica.
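To make the chunk assignment idea concrete, the following is a minimal sketch of bandwidth-proportional assignment with periodic reassignment. It is not the paper's algorithm: the function names `assign_chunks` and `reassign_remaining`, the largest-remainder rounding rule, and the representation of bandwidth estimates as a plain dictionary are all assumptions made for illustration.

```python
from typing import Dict, List


def assign_chunks(num_chunks: int, bandwidths: Dict[str, float]) -> Dict[str, int]:
    """Split num_chunks among replicas in proportion to estimated bandwidth.

    Hypothetical helper: rounding uses the largest-remainder rule, which is
    an assumption of this sketch, not something stated in the abstract.
    """
    total = sum(bandwidths.values())
    if num_chunks < 0 or total <= 0 or any(bw < 0 for bw in bandwidths.values()):
        raise ValueError("need non-negative inputs and a positive total bandwidth")
    # Ideal (fractional) share of chunks for each replica.
    ideal = {r: num_chunks * bw / total for r, bw in bandwidths.items()}
    # Start from the integer part of each share.
    counts = {r: int(share) for r, share in ideal.items()}
    # Give the leftover chunks to the replicas with the largest remainders.
    leftover = num_chunks - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda r: ideal[r] - counts[r], reverse=True)
    for r in by_remainder[:leftover]:
        counts[r] += 1
    return counts


def reassign_remaining(remaining: List[int],
                       bandwidths: Dict[str, float]) -> Dict[str, List[int]]:
    """Re-partition chunk IDs not yet transferred, using fresh estimates."""
    counts = assign_chunks(len(remaining), bandwidths)
    plan: Dict[str, List[int]] = {}
    start = 0
    for r, n in counts.items():
        plan[r] = remaining[start:start + n]
        start += n
    return plan


if __name__ == "__main__":
    # Initial plan: replica "a" is estimated to be three times faster.
    print(assign_chunks(10, {"a": 30.0, "b": 10.0, "c": 10.0}))
    # Later, chunks 4..9 are still pending and the estimates have changed.
    print(reassign_remaining(list(range(4, 10)), {"a": 10.0, "b": 20.0}))
```

In this sketch, calling `reassign_remaining` whenever the bandwidth estimates are refreshed plays the role of the dynamic update: only chunks that have not yet been sent are redistributed, so a replica whose bandwidth drops is given less of the remaining work.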