In large-scale distributed machine learning systems, coded computing has attracted widespread attention because it can effectively alleviate the impact of stragglers. However, several emerging problems greatly limit the performance of coded distributed systems. First, the existence of colluding workers, who share their intermediate results with one another, leads to serious privacy leakage. Second, few existing works consider the security of data transmission in distributed computing systems. Third, the number of results the master must wait for grows with the degree of the decoding function. In this paper, we design a secure and private approximated coded distributed computing (SPACDC) scheme that addresses these problems simultaneously. Our SPACDC scheme guarantees data security during transmission using a new encryption algorithm based on elliptic curve cryptography. In particular, the SPACDC scheme imposes no strict constraint on the minimum number of results that must be waited for. An extensive performance analysis is conducted to demonstrate the effectiveness of our SPACDC scheme. Furthermore, we present a secure and private distributed learning algorithm based on the SPACDC scheme, which provides information-theoretic privacy protection for the training data. Our experiments show that the SPACDC-based deep learning algorithm achieves a significant speedup over the baseline approaches.
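The abstract does not specify the encryption algorithm's construction, but the following minimal sketch illustrates the general idea of using elliptic curve cryptography to protect data shares in transit between a master and a worker. The toy curve parameters, the variable names, and the additive masking step are illustrative assumptions, not the paper's SPACDC algorithm.

```python
# Sketch: ECDH-style key agreement on a toy curve, then masking a data share
# before transmission. For illustration only; real systems use standardized
# curves and authenticated encryption.
import random

# Toy curve y^2 = x^3 + a*x + b over F_p (small prime, illustration only).
p, a, b = 9739, 497, 1768
G = (1804, 5368)  # a base point on this toy curve

def inv(x):
    # Modular inverse via Fermat's little theorem (p is prime).
    return pow(x, p - 2, p)

def ec_add(P, Q):
    # Elliptic-curve point addition in affine coordinates; None = infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * inv(2 * y1) % p
    else:
        lam = (y2 - y1) * inv(x2 - x1) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P):
    # Scalar multiplication by repeated doubling and adding.
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

# Master and worker each pick a secret scalar and publish a public point.
master_sk, worker_sk = random.randrange(1, p), random.randrange(1, p)
master_pk, worker_pk = ec_mul(master_sk, G), ec_mul(worker_sk, G)

# Both sides derive the same shared point; its x-coordinate seeds a mask.
shared = ec_mul(master_sk, worker_pk)
assert shared == ec_mul(worker_sk, master_pk)

# The master masks an encoded data share before sending; the worker unmasks.
data_share = 4321                      # stand-in for one encoded data block
masked = (data_share + shared[0]) % p
recovered = (masked - shared[0]) % p
assert recovered == data_share
```

The key point of such a construction is that only the masked share traverses the network, so an eavesdropper without either secret scalar learns nothing useful about the encoded data block.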