Erasure coding has been recognized as a powerful method to mitigate delays due to slow or straggling nodes in distributed systems. This work shows that erasure coding of data objects can also flexibly handle skews in the request rates. Coding can help enlarge the \emph{service rate region}, that is, increase the overall volume of data access requests that the system can handle. This paper aims to put forward the service rate region as an important consideration in the design of erasure-coded distributed systems. We highlight several open problems that can be grouped into two broad threads: 1) characterizing the service rate region of a given code and finding the optimal request allocation, and 2) designing the underlying erasure code for a given service rate region. As contributions along the first thread, we characterize the service rate regions of maximum-distance-separable, locally repairable, and Simplex codes. In terms of code design, we show the effectiveness of hybrid codes that combine replication and erasure coding. We also discover fundamental connections between multi-set batch codes and the problem of maximizing the service rate region.
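To make the notion of a service rate region concrete, the following minimal sketch checks whether a pair of request rates can be served by a toy three-node system. Everything in it is an illustrative assumption rather than a construction taken from the paper: the nodes store a, b, and a+b, each node serves requests at rate `mu`, and the names `allocate_coded` and `feasible_replicated` are hypothetical. For this particular small system, serving directly first and decoding only the overflow happens to be enough to decide feasibility; larger codes generally require solving a linear program over all recovery sets.

```python
from typing import Dict, Optional

EPS = 1e-12  # tolerance for floating-point comparisons


def allocate_coded(lam_a: float, lam_b: float, mu: float = 1.0) -> Optional[Dict[str, float]]:
    """Split demand across nodes storing a, b, and a+b (assumed toy system).

    Returns the per-node load if the demand is feasible, otherwise None.
    Rates are assumed to be non-negative.
    """
    # Serve each object from its own node first, up to that node's capacity.
    direct_a = min(lam_a, mu)
    direct_b = min(lam_b, mu)
    # Overflow must be decoded: a from (b, a+b), b from (a, a+b).
    coded_a = lam_a - direct_a
    coded_b = lam_b - direct_b
    load = {
        "node_a": direct_a + coded_b,
        "node_b": direct_b + coded_a,
        "node_a_plus_b": coded_a + coded_b,
    }
    if any(v > mu + EPS for v in load.values()):
        return None
    return load


def feasible_replicated(lam_a: float, lam_b: float, mu: float = 1.0) -> bool:
    """Same three nodes, but storing a, a, b (replication instead of coding)."""
    return lam_a <= 2 * mu + EPS and lam_b <= mu + EPS


if __name__ == "__main__":
    for demand in [(2.0, 0.0), (0.0, 2.0), (1.0, 1.0), (1.5, 0.6)]:
        print(demand, allocate_coded(*demand), feasible_replicated(*demand))
```

Under these assumptions the coded system serves any demand with `lam_a + lam_b <= 2 * mu`, so it absorbs a skew toward either object, while the replicated layout serves more total traffic at some points (for example `(2.0, 1.0)`) but cannot follow a skew toward b.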