Robust and reliable anonymization of chest radiographs is an essential step before large datasets of such images are published for research purposes. Conventional anonymization obscures personal information in the images with black boxes and removes or replaces meta-information. However, such simple measures retain biometric information in the chest radiographs, allowing patients to be re-identified by a linkage attack. Therefore, we see an urgent need to obfuscate the biometric information appearing in the images. We propose what is, to the best of our knowledge, the first deep learning-based approach for the targeted anonymization of chest radiographs that maintains data utility for diagnostic and machine learning purposes. Our model architecture is composed of three independent neural networks that, when used together, learn a deformation field that impedes patient re-identification. The individual influence of each component is investigated in an ablation study. Quantitative results on the ChestX-ray14 dataset show that patient re-identification performance, measured as the area under the receiver operating characteristic curve (AUC), drops from 81.8% to 58.6% with little impact on abnormality classification performance. This indicates that underlying abnormality patterns are preserved while patient privacy increases. Furthermore, we compare the proposed deep learning-based anonymization approach with differentially private image pixelization and demonstrate the superiority of our method in resolving the privacy-utility trade-off for chest radiographs.
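To make the reported re-identification metric concrete, the following minimal sketch computes an AUC from pairwise similarity scores. It assumes, purely for illustration, that re-identification is evaluated as a verification task in which a model scores image pairs as "same patient" or "different patient"; the function name `roc_auc` and all scores are hypothetical and are not taken from the paper. An AUC of 50% corresponds to chance-level re-identification, which is why a drop toward that value indicates stronger anonymization.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    scores: similarity scores from a (hypothetical) patient-verification model.
    labels: 1 if the image pair shows the same patient, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up scores for illustration only (not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
original_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]    # pairs mostly separable
anonymized_scores = [0.6, 0.3, 0.4, 0.5, 0.2, 0.7]  # pairs hard to separate

auc_original = roc_auc(original_scores, labels)
auc_anonymized = roc_auc(anonymized_scores, labels)
print(f"AUC original:   {auc_original:.3f}")
print(f"AUC anonymized: {auc_anonymized:.3f}")
```

The pairwise counting is quadratic in the number of pairs and is chosen here for readability; a rank-based implementation would be preferable for large evaluation sets.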