Generalizing federated learning (FL) models to unseen clients with non-iid data is a crucial yet unsolved problem. In this work, we tackle it from a novel causal perspective. Specifically, we build a training structural causal model (SCM) to explain the challenges of model generalization in a distributed learning paradigm. Based on this, we present a simple yet effective method that uses test-specific and momentum-tracked batch normalization (TsmoBN) to generalize FL models to testing clients. We give a causal analysis by formulating a second, testing SCM and demonstrate that the key factor in TsmoBN is the test-specific statistics (i.e., mean and variance) of features, which can be seen as a surrogate variable for causal intervention. In addition, by considering generalization bounds in FL, we show that TsmoBN reduces the divergence between training and testing feature distributions and thereby achieves a lower generalization gap than standard model testing. Extensive experiments demonstrate significant improvements in unseen-client generalization on three datasets with various types of feature distributions and numbers of clients. Our approach can be flexibly applied to different state-of-the-art federated learning algorithms and is orthogonal to existing domain generalization methods.
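To make the idea of test-specific, momentum-tracked statistics concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not give the exact update rule, so the sketch assumes an exponential moving average over test batches, initialized from the first test batch. The class name `MomentumTestBN`, the default `momentum` value, and the `(batch, channels)` input shape are hypothetical choices made for illustration.

```python
import numpy as np


class MomentumTestBN:
    """Hypothetical test-time batch norm for inputs of shape (batch, channels).

    Assumption: statistics are tracked with an exponential moving average
    over test batches; the paper's exact update rule may differ.
    """

    def __init__(self, num_features, momentum=0.9, eps=1e-5):
        self.momentum = momentum            # weight on the running estimate (assumed)
        self.eps = eps                      # numerical stability term
        self.gamma = np.ones(num_features)  # scale, would come from the trained model
        self.beta = np.zeros(num_features)  # shift, would come from the trained model
        self.mean = None                    # test-specific running mean
        self.var = None                     # test-specific running variance

    def __call__(self, x):
        batch_mean = x.mean(axis=0)
        batch_var = x.var(axis=0)
        if self.mean is None:
            # First test batch: start from the test client's own statistics
            # instead of the statistics stored during training.
            self.mean, self.var = batch_mean, batch_var
        else:
            m = self.momentum
            self.mean = m * self.mean + (1.0 - m) * batch_mean
            self.var = m * self.var + (1.0 - m) * batch_var
        # Normalize with the tracked test-specific statistics.
        x_hat = (x - self.mean) / np.sqrt(self.var + self.eps)
        return self.gamma * x_hat + self.beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bn = MomentumTestBN(num_features=4)
    # Simulated unseen client whose features are shifted and rescaled.
    for _ in range(20):
        out = bn(rng.normal(loc=5.0, scale=3.0, size=(32, 4)))
    print("tracked mean:", bn.mean)
    print("output mean:", out.mean(axis=0))
```

In this sketch the training-time running statistics are never used at test time; only the learned affine parameters would carry over from training. This mirrors the abstract's claim that the test-specific mean and variance are the key factor.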