Privacy-preserving regression in machine learning is a crucial area of research, aimed at enabling the use of powerful machine learning techniques while protecting individuals' privacy. In this paper, we implement privacy-preserving regression training using data encrypted under a fully homomorphic encryption scheme. We first examine the common linear regression algorithm and propose a (simplified) fixed Hessian for linear regression training, which can be applied to any dataset, even one not normalized into the range $[0, 1]$. We also generalize this constant Hessian matrix to ridge regression, namely linear regression with a regularization term that penalizes large coefficients. Our main contribution, however, is a novel and efficient algorithm called LFFR for homomorphic regression using the logistic function, which can model more complex relations between input values and output predictions than linear regression can. We also find a constant simplified Hessian to train our LFFR algorithm using a Newton-like method, and compare it against our new fixed-Hessian linear regression training on two real-world datasets. We suggest normalizing not only the data but also the target predictions, even for the original linear regression used in a privacy-preserving manner. Doing so helps keep the weights in a small range, say $[-5, +5]$, which is convenient when setting the parameters for refreshing ciphertexts, and avoids tuning the regularization parameter $\lambda$ via cross-validation. Linear regression with normalized predictions could therefore be a viable alternative to ridge regression.
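To make the fixed-Hessian idea concrete, the following is a minimal plaintext sketch of linear regression trained with a constant diagonal surrogate for the Hessian, on features and targets both scaled into $[0, 1]$. Several things here are assumptions rather than details taken from the abstract: the surrogate $B_{kk} = \sum_j |(X^\top X)_{kj}|$ is one possible simplified fixed Hessian and may differ from the one proposed in the paper, the helper names (`min_max_scale`, `fixed_hessian_diag`, `train`) are hypothetical, the data is synthetic, and no encryption is involved.

```python
import random

def min_max_scale(values):
    """Map a list of numbers into [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def fixed_hessian_diag(X):
    """Diagonal surrogate B for H = X^T X, with B_kk = sum_j |H_kj|.

    This row-sum construction is an assumption of this sketch. It makes
    B - H positive semidefinite, so an update below cannot increase the loss.
    """
    d = len(X[0])
    return [
        sum(abs(sum(row[k] * row[j] for row in X)) for j in range(d)) + 1e-8
        for k in range(d)
    ]

def squared_loss(X, y, w):
    """Half the sum of squared residuals."""
    return 0.5 * sum((sum(wk * xk for wk, xk in zip(w, row)) - yi) ** 2
                     for row, yi in zip(X, y))

def train(X, y, iters=300):
    """Newton-like iterations w <- w - B^{-1} X^T (Xw - y) with a constant B."""
    d = len(X[0])
    B = fixed_hessian_diag(X)          # computed once, reused every iteration
    w = [0.0] * d
    losses = [squared_loss(X, y, w)]
    for _ in range(iters):
        residual = [sum(wk * xk for wk, xk in zip(w, row)) - yi
                    for row, yi in zip(X, y)]
        grad = [sum(r * row[k] for r, row in zip(residual, X)) for k in range(d)]
        w = [wk - g / b for wk, g, b in zip(w, grad, B)]
        losses.append(squared_loss(X, y, w))
    return w, losses

# Synthetic data on arbitrary raw scales (illustrative only, not from the paper).
rng = random.Random(0)
x1 = [rng.uniform(0, 100) for _ in range(200)]
x2 = [rng.uniform(-50, 50) for _ in range(200)]
y_raw = [3 + 2 * a - b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Normalize both the features and the targets into [0, 1].
u1, u2, y = min_max_scale(x1), min_max_scale(x2), min_max_scale(y_raw)
X = [[1.0, a, b] for a, b in zip(u1, u2)]   # leading 1.0 is the bias column

w, losses = train(X, y)
print("weights:", [round(v, 3) for v in w])
print("loss: %.4f -> %.4f" % (losses[0], losses[-1]))
```

Because `B` is constant, its reciprocal entries only need to be computed once, which is presumably part of the appeal in a homomorphic setting where division on ciphertexts is costly. With the targets also scaled into $[0, 1]$, the learned weights in this toy run stay small in magnitude, in line with the normalization suggestion above.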