A linear block code with dimension $k$, length $n$, and minimum distance $d$ is called a locally repairable code (LRC) with locality $r$ if any coded symbol can be recovered from at most $r$ other coded symbols. LRCs have recently been proposed and deployed in distributed storage systems (DSSs) such as Windows Azure Storage and Facebook HDFS-RAID. Theoretical bounds on the maximum locality of LRCs ($r$) have been established. The \textit{average} locality of an LRC ($\overline{r}$) directly affects the costly repair bandwidth, disk I/O, and number of nodes involved in repairing a missing data block. However, $\overline{r}$ has received little attention in the literature. In this paper, we establish a lower bound on $\overline{r}$ of arbitrary $(n,k,d)$ LRCs. Furthermore, we obtain a tight lower bound on $\overline{r}$ for the practical case where the code rate $(R=\frac{k}{n})$ is greater than $(1-\frac{1}{\sqrt{n}})^2$. Finally, we design three classes of LRCs that achieve the obtained bounds on $\overline{r}$. Compared with existing LRCs, our proposed codes improve the average locality without sacrificing crucial parameters such as the code rate or minimum distance.
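To make the distinction between $r$ and $\overline{r}$ concrete, the following is a minimal sketch, not the paper's construction. It assumes that $\overline{r}$ is the arithmetic mean of the per-symbol localities over all $n$ coded symbols, which the abstract does not state explicitly. The function names and the example locality values are hypothetical.

```python
from fractions import Fraction
from math import sqrt


def locality_stats(symbol_localities):
    """Return (r, r_bar) for a list of per-symbol localities.

    Assumption: r is the maximum locality and r_bar is the plain
    average over all n coded symbols.
    """
    n = len(symbol_localities)
    r_max = max(symbol_localities)
    r_avg = Fraction(sum(symbol_localities), n)
    return r_max, r_avg


def in_high_rate_regime(n, k):
    """Check the rate condition R = k/n > (1 - 1/sqrt(n))^2."""
    return k / n > (1 - 1 / sqrt(n)) ** 2


# Hypothetical code with six symbols: four repairable from 3 others,
# two repairable only from 6 others.
r, r_bar = locality_stats([3, 3, 3, 3, 6, 6])
print(r, r_bar)
print(in_high_rate_regime(16, 10))
```

In this toy example the maximum locality is driven by the two worst symbols, while the average reflects that most repairs touch fewer nodes, which is the quantity tied to repair bandwidth and disk I/O above.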