We consider replication-based distributed storage systems in which each node stores the same amount of data and each stored data bit has the same replication factor across the nodes. Such systems are referred to as \textit{balanced distributed databases}. When existing nodes leave or new nodes are added to the system, the balance of the database is lost, either because the replication factor is reduced or because the storage at the nodes becomes non-uniform. This triggers a \textit{rebalancing} algorithm, which exchanges data between the nodes so that the balance of the database is restored. In a recent work by Krishnan et al., coded transmissions were used to rebalance a carefully designed distributed database after a node removal or addition. These coded rebalancing schemes have optimal communication load, but they require the file size to be at least exponential in the system parameters. In this work, we consider a \textit{cyclic balanced database} (where data is placed cyclically across the system nodes) and present coded rebalancing schemes for node removal and addition in such a database. These databases (and the associated rebalancing schemes) require the file size to be only \textit{cubic} in the number of nodes in the system. We show that the communication load of the node-removal rebalancing scheme is strictly smaller than the load of the uncoded scheme. In the node-addition scenario, the rebalancing scheme presented is a simple uncoded scheme, which we show has optimal load.
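To make the notions of balance and cyclic placement concrete, the following is a minimal Python sketch. It assumes one plausible reading of "cyclically placed": the file is split into as many segments as there are nodes, and segment i is stored on nodes i, i+1, ..., i+r-1 (mod K), where r is the replication factor. The function names and the parameters K and r are illustrative choices, not notation taken from the abstract, and the sketch does not implement the rebalancing schemes themselves.

```python
from collections import Counter


def cyclic_placement(num_nodes, replication):
    """Assumed layout: segment i lives on nodes i, i+1, ..., i+replication-1 (mod num_nodes)."""
    return {
        seg: [(seg + j) % num_nodes for j in range(replication)]
        for seg in range(num_nodes)
    }


def node_loads(placement):
    """Count how many segments each node stores."""
    return Counter(node for nodes in placement.values() for node in nodes)


def remove_node(placement, removed):
    """Drop one node from every segment's replica list (no rebalancing yet)."""
    return {seg: [n for n in nodes if n != removed] for seg, nodes in placement.items()}


K, r = 5, 3  # hypothetical example values
placement = cyclic_placement(K, r)
loads = node_loads(placement)

# Balanced: every node stores r segments and every segment has r replicas.
balanced = all(loads[n] == r for n in range(K)) and all(
    len(nodes) == r for nodes in placement.values()
)

# After node 0 leaves, the segments it held drop below replication factor r.
after = remove_node(placement, 0)
deficient = sorted(seg for seg, nodes in after.items() if len(nodes) < r)
print(balanced, deficient)
```

Under this assumed layout, removing a node leaves exactly r segments with only r-1 replicas each, which is the loss of balance that a rebalancing scheme would then have to repair.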