Sharing medical datasets between hospitals is challenging for two reasons: patient privacy must be protected, and transmitting and storing large numbers of high-resolution medical images is costly. Dataset distillation can synthesize a small dataset such that models trained on it achieve performance comparable to that of models trained on the original large dataset, which makes it a promising approach to both problems. Hence, this paper proposes a novel dataset distillation-based method for medical dataset sharing. Experimental results on a COVID-19 chest X-ray image dataset show that our method can achieve high detection performance even when using only a small number of anonymized chest X-ray images.