Generalized Lagrange Coded Computing: A Flexible Computation-Communication Tradeoff

We consider the problem of evaluating arbitrary multivariate polynomials over a massive dataset in a distributed computing system with a master node and multiple worker nodes. Generalized Lagrange Coded Computing (GLCC) codes are proposed to provide robustness against stragglers that do not return computation results in time and against adversarial workers that deliberately modify results for their own benefit, as well as information-theoretic security of the dataset under possible collusion among workers. GLCC codes are constructed by first partitioning the dataset into multiple groups and then encoding the dataset using carefully designed interpolation polynomials, such that interfering computation results across groups can be eliminated at the master. In particular, GLCC codes include the state-of-the-art Lagrange Coded Computing (LCC) codes as a special case, and achieve a more flexible tradeoff between communication and computation overheads in optimizing system efficiency.
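To make the encode, compute, and decode pipeline concrete, the following is a minimal sketch of the LCC special case only, not of the GLCC construction itself (the abstract does not specify the grouping or the interference-cancelling polynomials). It covers straggler tolerance and omits adversarial workers and privacy masking. The field size `P`, the target polynomial `f`, the evaluation points, and all function names are assumptions chosen for illustration.

```python
# Illustrative LCC-style sketch over a prime field (assumed parameters throughout).
P = 2_147_483_647  # assumed prime modulus (2^31 - 1)


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial through the points (xs[i], ys[i]) over GF(p)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


DEG_F = 2


def f(x, p=P):
    """Hypothetical degree-2 polynomial that every worker evaluates."""
    return (x * x + 3 * x + 1) % p


def lcc_demo(data, n_workers, stragglers):
    """Encode data, let non-straggling workers compute f, and decode at the master."""
    k = len(data)
    betas = list(range(1, k + 1))                    # interpolation points for the data
    alphas = list(range(k + 1, k + 1 + n_workers))   # one evaluation point per worker

    # Encoding: worker i receives u(alpha_i), where u(beta_j) = data[j].
    shares = [lagrange_eval(betas, data, a) for a in alphas]

    # Computation: stragglers never report back.
    results = {i: f(s) for i, s in enumerate(shares) if i not in stragglers}

    # f(u(z)) has degree at most (k - 1) * DEG_F, so this many results suffice.
    need = (k - 1) * DEG_F + 1
    if len(results) < need:
        raise ValueError("too many stragglers to decode")

    used = sorted(results)[:need]
    xs = [alphas[i] for i in used]
    ys = [results[i] for i in used]

    # Decoding: interpolate f(u(z)) and read it off at the data points.
    return [lagrange_eval(xs, ys, b) for b in betas]


if __name__ == "__main__":
    print(lcc_demo([5, 7, 11], n_workers=7, stragglers={1, 4}))
```

In this sketch each worker handles a single coded share, and the master must wait for `(k - 1) * DEG_F + 1` results. The abstract describes GLCC as adjusting this kind of tradeoff between computation and communication by partitioning the dataset into groups.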

