Sketching is one of the most fundamental tools in large-scale machine learning. It yields runtime and memory savings by randomly compressing the original large problem into lower dimensions. In this paper, we propose a novel sketching scheme for first-order methods in the large-scale distributed learning setting, which reduces the communication cost between distributed agents while still guaranteeing the convergence of the algorithms. Given gradient information in a high dimension $d$, the agent passes the compressed information processed by a sketching matrix $R\in \R^{s\times d}$ with $s\ll d$, and the receiver de-sketches via the de-sketching matrix $R^\top$ to ``recover'' the information in the original dimension. Using this framework, we develop algorithms for federated learning with lower communication costs. However, such random sketching does not directly protect the privacy of local data. We show that the gradient leakage problem persists after applying the sketching technique by exhibiting a specific gradient attack. As a remedy, we rigorously prove that the algorithm becomes differentially private once additional random noise is added to the gradient information, yielding a first-order approach for federated learning that is both communication-efficient and differentially private. Our sketching scheme can be further generalized to other learning settings and may be of independent interest.
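To make the communication pattern concrete, the following is a minimal NumPy sketch of the sketch/de-sketch step described above. It assumes a Gaussian random sketching matrix and Gaussian perturbation noise purely for illustration; the paper's actual construction of $R$, noise calibration, and federated aggregation are not reproduced here.

```python
import numpy as np

# Illustrative only: a Gaussian sketching matrix R in R^{s x d} with s << d.
# The agent transmits R @ g (s numbers) instead of the full gradient g (d numbers);
# the receiver de-sketches with R^T to return to the original dimension d.
d, s = 10_000, 100
rng = np.random.default_rng(0)

R = rng.normal(0.0, 1.0 / np.sqrt(s), size=(s, d))  # sketching matrix (assumed Gaussian)
g = rng.normal(size=d)                               # local gradient at the agent

sketched = R @ g            # compressed message sent over the network, shape (s,)
recovered = R.T @ sketched  # de-sketched gradient at the receiver, shape (d,)

# Sketching alone does not hide the local data; a differentially private variant
# would add calibrated noise to the gradient before sketching (sigma is a placeholder).
sigma = 0.1
sketched_private = R @ (g + rng.normal(0.0, sigma, size=d))

print(sketched.shape, recovered.shape)  # (100,) (10000,)
```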