Application of Federated Learning in Building a Robust COVID-19 Chest X-ray Classification Model

When developing artificial intelligence (AI)-based algorithms, the amount of available data plays a pivotal role: large amounts of data help researchers and engineers build robust AI algorithms. For medical imaging problems, these data normally have to be transferred from the medical institutions where they were acquired to the organizations developing the algorithms. This movement of data involves time-consuming formalities, such as complying with HIPAA and GDPR. There is also a risk that patients' private data will leak, compromising their confidentiality. One solution to these problems is the Federated Learning framework. Federated Learning (FL) helps produce robust AI models that generalize better by using data from different sources, with different distributions and data characteristics, without moving all the data to a central server. In our paper, we apply the FL framework to train a deep learning model for a binary classification problem: predicting the presence or absence of COVID-19. We took three different sources of data and trained an individual model on each source. We then trained an FL model on the complete data and compared the performance of all the models. We demonstrated that the FL model performs better than the individual models. Moreover, the FL model performed on par with a model trained on all the data combined at a central server. Thus, Federated Learning leads to generalized AI models without the cost of data transfer and regulatory overhead.
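The abstract does not say how the models from the three sources are combined. As an illustration only, the sketch below assumes federated averaging (FedAvg): each site trains locally on its own data, and a server averages the resulting weights in proportion to each site's sample count. The logistic-regression model, the synthetic one-feature data, and every function name here are hypothetical stand-ins for the paper's deep learning model and chest X-ray datasets.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # Probability of the positive class under a linear model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def local_update(weights, data, lr=0.1, epochs=5):
    # One site's training step: SGD on its private data, starting from
    # the current global weights. The raw data never leaves the site.
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                w[i] -= lr * err * xi
    return w

def fed_avg(client_weights, client_sizes):
    # Server step (assumed FedAvg): sample-count-weighted mean of weights.
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def make_site(n, shift, rng):
    # Hypothetical synthetic site: one feature plus a bias input of 1.0.
    # Each site draws the feature from a differently shifted distribution.
    data = []
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        data.append(([x, 1.0], 1 if x > 0 else 0))
    return data

def accuracy(w, data):
    hits = sum((predict(w, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)

def train_federated(sites, rounds=20):
    # Only model weights travel between the sites and the server.
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_w, site) for site in sites]
        global_w = fed_avg(updates, [len(site) for site in sites])
    return global_w

rng = random.Random(0)
sites = [make_site(50, -1.0, rng), make_site(80, 0.0, rng), make_site(30, 1.5, rng)]
global_w = train_federated(sites)
pooled = [sample for site in sites for sample in site]
print("federated accuracy on pooled data:", accuracy(global_w, pooled))
```

The comparison described in the abstract could be mimicked in this toy setting by calling `local_update` on a single site's data (an individual model) or on `pooled` (a centrally trained model) and evaluating each with `accuracy`.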

