Precision medicine is an emerging approach to disease treatment and prevention that delivers personalized care to individual patients by considering their genetic makeup, medical history, environment, and lifestyle. Despite the rapid advancement of precision medicine and its considerable promise, several underlying technological challenges remain unsolved. One such challenge of great importance is the security and privacy of precision health-related data, such as genomic data and electronic health records; concerns over these data stifle collaboration and keep machine-learning (ML) algorithms from reaching their full potential. To preserve data privacy while providing ML solutions, this article makes three contributions. First, we propose a generic machine learning with encryption (MLE) framework, which we use to build an ML model that predicts cancer from one of the most recent comprehensive genomics datasets in the field. Second, our framework's prediction accuracy is slightly higher than that of the most recent studies conducted on the same dataset, yet it maintains the privacy of the patients' genomic data. Third, to facilitate the validation, reproduction, and extension of this work, we provide an open-source repository that contains the design and implementation of the framework, all the ML experiments and code, and the final predictive model deployed to a free cloud service.