In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data. However, recent privacy regulations restrict data sharing and, consequently, hinder the development of machine learning-based solutions that support diagnosis and prognosis. Federated learning (FL) aims to sidestep this limitation by bringing AI-based solutions to data owners and sharing only local AI models, or parts thereof, which then need to be aggregated. However, most existing federated learning solutions are still in their infancy and show several shortcomings, from the lack of a reliable and effective aggregation scheme able to retain the knowledge learned locally to weak privacy preservation, since real data may be reconstructed from model updates. Furthermore, the majority of these approaches, especially those dealing with medical data, rely on a centralized distributed learning strategy that poses robustness, scalability and trust issues. In this paper we present a decentralized distributed method that, exploiting concepts from experience replay and generative adversarial research, effectively integrates features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy. The proposed approach is tested on two tasks, tuberculosis and melanoma classification, using multiple datasets in order to simulate realistic non-i.i.d. data scenarios. Results show that our approach achieves performance comparable to both standard (non-federated) learning and federated methods in their centralized (thus, more favourable) formulation.