Federated learning (FL) is one of the most important paradigms addressing privacy and data governance issues in machine learning (ML). Adversarial training has so far emerged as the most promising approach against evasion threats on ML models. In this paper, we take the first known steps towards federated adversarial training (FAT), combining both methods to reduce the threat of evasion during inference while preserving data privacy during training. We investigate the effectiveness of the FAT protocol in idealised federated settings using MNIST, Fashion-MNIST, and CIFAR10, and provide first insights into stabilising the training on the LEAF benchmark dataset, which specifically emulates a federated learning environment. We identify challenges with this natural extension of adversarial training with regard to the achieved adversarial robustness, and further examine the idealised settings in the presence of clients undermining model convergence. We find that the Trimmed Mean and Bulyan defences can be compromised, and we subvert Krum with a novel distillation-based attack that presents an apparently "robust" model to the defender while the model in fact fails to provide robustness against simple attack modifications.
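To make the FAT protocol concrete, the following is a minimal sketch of a single training round, assuming FedAvg aggregation and PGD-based local adversarial training; the model, client data loaders, and hyper-parameter values are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal sketch of one federated adversarial training (FAT) round:
# each client runs adversarial training locally, then the server
# averages the resulting model parameters (FedAvg).
import copy
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=0.3, alpha=0.01, steps=40):
    """Projected gradient descent inside an L-infinity ball of radius eps.
    The attack parameters here are illustrative defaults."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # project back into the ball
        x_adv = torch.clamp(x_adv, 0.0, 1.0)           # stay in valid input range
    return x_adv.detach()


def local_adversarial_update(global_model, loader, epochs=1, lr=0.01):
    """One client's update: adversarial training starting from the global weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x_adv = pgd_attack(model, x, y)
            opt.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            opt.step()
    return model.state_dict()


def fedavg(state_dicts):
    """Unweighted federated averaging of client parameters."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg


def fat_round(global_model, client_loaders):
    """One FAT round: collect local adversarial updates, then aggregate."""
    updates = [local_adversarial_update(global_model, ld) for ld in client_loaders]
    global_model.load_state_dict(fedavg(updates))
    return global_model
```

Replacing `fedavg` with a robust aggregation rule such as Krum, Trimmed Mean, or Bulyan yields the defended variants examined in the paper.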