Federated medical relation extraction enables multiple clients to collaboratively train a deep network without sharing their raw medical data. To handle the heterogeneous label distributions across clients, most existing works only enforce regularization between the local and global models during optimization. In this paper, we fully exploit the models of all clients and propose the novel concept of \textit{major classifier vectors}, a group of class vectors obtained on the server in an ensemble manner rather than by weighted averaging. The major classifier vectors are then distributed to all clients, and each client's local training is Contrasted with the Major Classifier vectors (FedCMC), so that the local model is less prone to overfitting the local label distribution. FedCMC requires only a small additional transfer of classifier parameters and leaks no raw data, extracted representations, or label distributions. Extensive experiments show that FedCMC outperforms state-of-the-art FL algorithms on three medical relation extraction datasets.
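To make the two ideas in the abstract concrete, the following is a minimal sketch (not the authors' released code): the server collects each client's classifier weight matrix and, instead of weighted-averaging them, forms a per-class ensemble from which one major classifier vector per class is selected (here, the client vector closest to the per-class mean; this particular selection rule is an assumption), and each client adds a contrastive term that pulls its local class vectors toward the matching major vector and away from the others.

\begin{verbatim}
# Hypothetical sketch of the server-side ensemble and the client-side
# contrastive regularizer; the exact selection rule and loss form are
# assumptions, not the paper's published implementation.
import torch
import torch.nn.functional as F

def major_classifier_vectors(client_classifiers):
    """client_classifiers: list of [num_classes, dim] weight matrices,
    one per client. Returns [num_classes, dim] major classifier vectors."""
    stacked = torch.stack(client_classifiers)      # [clients, classes, dim]
    mean = stacked.mean(dim=0, keepdim=True)       # per-class mean vector
    # similarity of every client's class vector to the per-class mean
    sim = F.cosine_similarity(stacked, mean, dim=-1)   # [clients, classes]
    best_client = sim.argmax(dim=0)                    # [classes]
    idx = torch.arange(stacked.shape[1])
    return stacked[best_client, idx]                   # [classes, dim]

def contrast_with_major(local_classifier, major_vectors, temperature=0.5):
    """Contrastive regularizer added to the local loss: each local class
    vector should be most similar to the major vector of its own class."""
    logits = F.cosine_similarity(local_classifier.unsqueeze(1),
                                 major_vectors.unsqueeze(0),
                                 dim=-1) / temperature
    targets = torch.arange(local_classifier.shape[0])
    return F.cross_entropy(logits, targets)
\end{verbatim}

In this sketch only the classifier weight matrices leave the clients, which matches the abstract's claim of a small extra transfer of classifier parameters without exposing raw data, representations, or label distributions.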