Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Research that collects and combines datasets from multiple data custodians and jurisdictions can benefit substantially from the increased statistical power available to support its analytical goals. However, combining data from these sources raises significant privacy concerns that need to be addressed. In this paper, we propose secret-sharing-based privacy-preserving logistic regression protocols that use the Newton-Raphson method. The proposed approaches are based on secure multi-party computation (MPC) under different security settings and are designed to analyze data owned by several data holders. We conducted experiments on both synthetic and real-world datasets and compared the efficiency and accuracy of the protocols with those of an ordinary logistic regression model. Experimental results demonstrate that the proposed protocols are highly efficient and accurate. This study introduces iterative algorithms that simplify the federated training of a logistic regression model in a privacy-preserving manner. Our implementation results show that the improved method can handle large datasets when securely training a logistic regression model on data from multiple sources.
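To make the combination of secret sharing and Newton-Raphson concrete, the following is a minimal sketch and not the protocol proposed in the paper. It assumes additive secret sharing over a prime field with fixed-point encoding, a single feature plus an intercept, and honest parties. All names and constants (`PRIME`, `SCALE`, `secure_sum`, `local_stats`, `fit`) are hypothetical and chosen for illustration. Each data holder computes its local gradient and Hessian contributions, only the aggregated sums are reconstructed, and the Newton-Raphson update is applied to those sums. A full MPC protocol would typically keep the aggregates secret-shared as well.

```python
import math
import secrets

PRIME = 2**61 - 1   # field modulus (assumed for illustration)
SCALE = 10**6       # fixed-point scaling factor (assumed for illustration)

def encode(x):
    """Map a real number to a field element using fixed-point encoding."""
    return round(x * SCALE) % PRIME

def decode(v):
    """Map a field element back to a real number (upper half = negatives)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE

def share(value, n_parties):
    """Split a field element into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

def secure_sum(local_values):
    """Sum one private value per party; only the total is reconstructed."""
    n = len(local_values)
    dealt = [share(encode(v), n) for v in local_values]
    # Party j adds up the j-th share it received from every party.
    partial = [sum(dealt[i][j] for i in range(n)) % PRIME for j in range(n)]
    return decode(reconstruct(partial))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_stats(rows, beta):
    """Local gradient (g0, g1) and Hessian entries (h00, h01, h11)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = sigmoid(beta[0] + beta[1] * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return [g0, g1, h00, h01, h11]

def fit(parties, aggregate, iterations=10):
    """Newton-Raphson on aggregated statistics: beta += (X^T W X)^-1 X^T (y - p)."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        stats = [local_stats(rows, beta) for rows in parties]
        g0, g1, h00, h01, h11 = [
            aggregate([s[k] for s in stats]) for k in range(5)
        ]
        det = h00 * h11 - h01 * h01
        beta = [
            beta[0] + (h11 * g0 - h01 * g1) / det,
            beta[1] + (-h01 * g0 + h00 * g1) / det,
        ]
    return beta

# Two hypothetical data holders, each with (feature, label) rows.
parties = [
    [(0.5, 0), (1.5, 0), (2.0, 1)],
    [(2.5, 0), (3.0, 1), (4.0, 1)],
]
beta_secure = fit(parties, secure_sum)
beta_plain = fit(parties, sum)  # pooled, non-private baseline
print(beta_secure, beta_plain)
```

In this sketch the secret-shared fit and the pooled plaintext fit should agree up to the fixed-point rounding error introduced by `SCALE`, which mirrors the kind of accuracy comparison against an ordinary logistic regression model described above.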