Machine learning on big data is receiving increasing attention in various fields. At the same time, privacy-preserving techniques are becoming more important, and even necessary, due to legal regulations such as the General Data Protection Regulation (GDPR). Moreover, data is often distributed among several parties. In the medical context in particular, there are several data holders (e.g. hospitals), and we need to deal with highly sensitive values. A real-world scenario is data held in an electronic patient record, which is now available in many countries. The medical data is encrypted, and users (e.g. physicians, hospitals) can decrypt it only after patient authorization. One of the main questions in this scenario is whether the data can be processed for research purposes without violating the privacy of the data owner. We want to evaluate which cryptographic mechanism (homomorphic encryption, multiparty computation, or trusted execution environments) can be used for this task.