We study the problem of efficiently computing on encoded data. More specifically, we study the question of low-bandwidth computation of functions $F:\mathbb{F}^k \to \mathbb{F}$ of some data $x \in \mathbb{F}^k$, given access to an encoding $c \in \mathbb{F}^n$ of $x$ under an error correcting code. In our model -- relevant in distributed storage, distributed computation and secret sharing -- each symbol of $c$ is held by a different party, and we aim to minimize the total amount of information downloaded from each party in order to compute $F(x)$. Special cases of this problem have arisen in several domains, and we believe that it is fruitful to study this problem in generality. Our main result is a low-bandwidth scheme to compute linear functions for Reed-Solomon codes, even in the presence of erasures. More precisely, let $\epsilon > 0$ and let $\mathcal{C}: \mathbb{F}^k \to \mathbb{F}^n$ be a full-length Reed-Solomon code of rate $1 - \epsilon$ over a field $\mathbb{F}$ with constant characteristic. For any $\gamma \in [0, \epsilon)$, our scheme can compute any linear function $F(x)$ given access to any $(1 - \gamma)$-fraction of the symbols of $\mathcal{C}(x)$, with download bandwidth $O(n/(\epsilon - \gamma))$ bits. In contrast, the naive scheme that involves reconstructing the data $x$ and then computing $F(x)$ uses $\Theta(n \log n)$ bits. Our scheme has applications in distributed storage, coded computation, and homomorphic secret sharing.
翻译:我们研究在编码数据中高效计算的问题。 更具体地说, 我们研究低频度计算函数 $F:\ mathbb{ F\ k\ to\ mathb{ F} 一些数据$x$的美元 $, $ mathb{ F\\ k$, 使用一个编码 $c\ in\ mathb{ F\\\\\ n美元 美元 校正代码。 在我们模型中 -- 与分布存储、 分配计算和秘密共享相关 -- 每种美元值的数值由不同的方持有, 我们的目标是将从每个方下载的信息总量减少到最小值 $F(xb) 。 这个问题的特殊案例出现在几个域中, 我们认为, 以一般方式研究这个问题是有成效的。 我们的主要结果是低频度计算 Reed- Solomon 的线性函数, 即使存在消除功能 。 更精确地说, 任何( 美元) 以美元==美元 和 美元 直径=xxxxxxxx 数据流 。