Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity

From arXiv; accepted at ICML 2021. Code is available at https://github.com/Optimization-AI/ICML2021_FedDeepAUC_CODASCA, which is part of the open-source library LibAUC (www.libauc.org).

Deep AUC (area under the ROC curve) Maximization (DAM) has attracted much attention recently due to its great potential for imbalanced data classification. However, research on Federated Deep AUC Maximization (FDAM) is still limited. Compared with standard federated learning (FL) approaches, which focus on decomposable minimization objectives, FDAM is more complicated because its minimization objective is non-decomposable over individual examples. In this paper, we propose improved FDAM algorithms for heterogeneous data by solving the popular non-convex strongly-concave min-max formulation of DAM in a distributed fashion; the algorithms can also be applied to a class of non-convex strongly-concave min-max problems. A striking result of this paper is that the communication complexity of the proposed algorithm is a constant, independent of both the number of machines and the accuracy level, which improves an existing result by orders of magnitude. Experiments demonstrate the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest X-ray images from different organizations. Our experiments show that FDAM trained on data from multiple hospitals can improve the AUC score on test data from a single hospital for detecting life-threatening diseases based on chest radiographs. The proposed method is implemented in our open-source library LibAUC (www.libauc.org), whose GitHub address is https://github.com/Optimization-AI/ICML2021_FedDeepAUC_CODASCA.
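To make the "non-decomposable over individual examples" point concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm and does not use the LibAUC API. The function name `pairwise_auc`, the two clients, and their scores are all hypothetical, chosen only for illustration. It computes AUC as the fraction of (positive, negative) pairs that are ranked correctly, and shows that each client can have a perfect local AUC while the AUC on the pooled data is much lower, which is why averaging per-client objectives is not enough when data are heterogeneous.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    total = 0.0
    for sp, sn in product(pos, neg):
        if sp > sn:
            total += 1.0
        elif sp == sn:
            total += 0.5
    return total / (len(pos) * len(neg))


# Two hypothetical clients (e.g. hospitals) whose scores live on different scales.
client_a = ([0.9, 0.8, 0.7], [1, 0, 0])
client_b = ([0.4, 0.3, 0.2], [1, 1, 0])

auc_a = pairwise_auc(*client_a)  # every local pair is ranked correctly
auc_b = pairwise_auc(*client_b)  # every local pair is ranked correctly

# Pooling the data introduces cross-client pairs that no single client sees.
pooled_scores = client_a[0] + client_b[0]
pooled_labels = client_a[1] + client_b[1]
auc_pooled = pairwise_auc(pooled_scores, pooled_labels)

print(auc_a, auc_b, auc_pooled)
```

In this toy setup both local AUCs are 1.0, but the pooled AUC is 5/9, because positives from the second client score below negatives from the first. The loss depends on pairs of examples that may sit on different machines, which is the difficulty the min-max reformulation is meant to sidestep.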
