We introduce FELICIA (FEderated LearnIng with a CentralIzed Adversary), a generative mechanism enabling collaborative learning. In particular, we show how a data owner with limited and biased data can benefit from other data owners while the data from all sources remains private. This is a common scenario in medical image analysis, where privacy legislation prevents data from being shared outside local premises. FELICIA works for a large family of Generative Adversarial Network (GAN) architectures, including vanilla and conditional GANs, as demonstrated in this work. We show that, using the FELICIA mechanism, a data owner with limited image samples can generate high-quality synthetic images with high utility, while no data owner has to provide access to its data. The sharing happens solely through a central discriminator whose access is limited to synthetic data. Here, utility is defined as classification performance on a real test set. We demonstrate these benefits on several realistic healthcare scenarios using benchmark image datasets (MNIST, CIFAR-10) as well as on medical images for the task of skin lesion classification. With multiple experiments, we show that even in the worst cases, combining FELICIA with real data achieves performance on par with real data alone, while in most cases it significantly improves utility.
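To make the mechanism concrete, the following is a minimal, illustrative PyTorch sketch of the setup described above, not the paper's implementation: two data owners each train a local GAN on their own private images, while a central discriminator receives synthetic images only and supplies the cross-site feedback. The network sizes, the BCE losses, the weight `lambda_c`, and the central discriminator's objective (telling the two sites' synthetic samples apart) are assumptions made for this example.

```python
# Illustrative FELICIA-style sketch (assumptions noted above), not the paper's exact losses.
import torch
import torch.nn as nn

latent_dim, img_dim, n_sites = 64, 28 * 28, 2

def mlp(in_dim, out_dim, act):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, out_dim), act)

gens   = [mlp(latent_dim, img_dim, nn.Tanh()) for _ in range(n_sites)]
discs  = [mlp(img_dim, 1, nn.Sigmoid()) for _ in range(n_sites)]  # local: real vs. fake
disc_c = mlp(img_dim, 1, nn.Sigmoid())  # central: sees synthetic images only

bce   = nn.BCELoss()
opt_g = [torch.optim.Adam(g.parameters(), lr=2e-4) for g in gens]
opt_d = [torch.optim.Adam(d.parameters(), lr=2e-4) for d in discs]
opt_c = torch.optim.Adam(disc_c.parameters(), lr=2e-4)
lambda_c = 0.5  # illustrative weight on the central discriminator's feedback

def train_step(real_0, real_1):
    """One FELICIA-style update; real_i never leaves site i, only fakes are shared."""
    reals = [real_0, real_1]
    b = real_0.size(0)
    ones, zeros = torch.ones(b, 1), torch.zeros(b, 1)

    fakes = [g(torch.randn(b, latent_dim)) for g in gens]

    # 1) Local discriminators: each trained only on its own real data vs. its own fakes.
    for i in range(n_sites):
        d_loss = bce(discs[i](reals[i]), ones) + bce(discs[i](fakes[i].detach()), zeros)
        opt_d[i].zero_grad(); d_loss.backward(); opt_d[i].step()

    # 2) Central discriminator: trained purely on synthetic images; here it learns to
    #    tell site 0's fakes from site 1's fakes (an assumed objective for illustration).
    c_loss = bce(disc_c(fakes[0].detach()), zeros) + bce(disc_c(fakes[1].detach()), ones)
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()

    # 3) Generators: fool the local discriminator and, weighted by lambda_c, produce
    #    samples the central discriminator cannot attribute to either site.
    half = torch.full((b, 1), 0.5)
    for i in range(n_sites):
        g_loss = bce(discs[i](fakes[i]), ones) + lambda_c * bce(disc_c(fakes[i]), half)
        opt_g[i].zero_grad(); g_loss.backward(); opt_g[i].step()
```

In this sketch, `real_0` and `real_1` stand for minibatches drawn from each owner's private dataset; only the synthetic tensors and the central discriminator's feedback ever cross site boundaries, which is what keeps the real data local.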