This report considers the problem of resilient distributed optimization and stochastic learning in a server-based architecture. The system comprises a server and multiple agents, where each agent has its own local cost function. The agents collaborate with the server to find a minimum of the aggregate of their local cost functions. In the context of stochastic learning, the local cost of an agent is the loss function computed over that agent's data. We consider this problem in a system wherein some of the agents may be Byzantine faulty and some may be slow (also called stragglers). In this setting, we investigate the conditions under which an "approximate" solution to the above problem can be obtained. In particular, we introduce the notion of $(f, r; \epsilon)$-resilience to characterize how well the true solution is approximated in the presence of up to $f$ Byzantine faulty agents and up to $r$ slow agents (stragglers); smaller $\epsilon$ represents a better approximation. We also introduce a measure named $(f, r; \epsilon)$-redundancy to characterize the redundancy in the agents' cost functions; greater redundancy allows for a better approximation when solving the aggregate cost minimization problem. Finally, we constructively show, both theoretically and empirically, that $(f, r; \mathcal{O}(\epsilon))$-resilience can indeed be achieved in practice, provided that the local cost functions are sufficiently redundant.
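The abstract does not specify the authors' algorithm, but the setting it describes can be illustrated with a minimal sketch: the server waits for gradients from only $n - r$ of the $n$ agents (tolerating $r$ stragglers) and then applies a coordinate-wise trimmed mean that discards the $f$ largest and $f$ smallest values per coordinate (tolerating $f$ Byzantine agents). The function name and the trimmed-mean filter are illustrative assumptions, not the paper's method.

```python
import numpy as np

def resilient_aggregate(gradients, f):
    """Aggregate the gradient vectors that arrived at the server.

    Stragglers are handled implicitly: `gradients` contains only the
    n - r responses the server chose to wait for. Byzantine values are
    filtered with a coordinate-wise trimmed mean that drops the f
    largest and f smallest entries in each coordinate.
    """
    g = np.sort(np.asarray(gradients, dtype=float), axis=0)
    if f > 0:
        g = g[f:-f]  # discard the f extreme values per coordinate
    return g.mean(axis=0)

# Example: 7 agents, 2 stragglers never respond, so 5 gradients arrive;
# one of them is Byzantine and reports an arbitrarily large value.
arrived = [[1.0], [1.0], [1.0], [1.0], [1e6]]
print(resilient_aggregate(arrived, f=1))  # close to the honest mean [1.0]
```

The trimmed mean is one of several standard robust aggregators; the redundancy condition in the abstract is what would let such a filtered aggregate still approximate the true minimum of the aggregate cost.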