The increasing complexity of IT systems requires solutions that support operations in case of failure. Artificial Intelligence for System Operations (AIOps) has therefore become an increasingly active field of research, in both academia and industry. One of the major issues in this area is the lack of access to adequately labeled data, largely due to legal data-protection regulations or industrial confidentiality. Methods to mitigate this stem from the area of federated learning, in which no direct access to training data is required. Original approaches utilize a central instance to perform model synchronization by periodically aggregating all model parameters. However, there are many scenarios where trained models cannot be published, since they either constitute confidential knowledge or training data could be reconstructed from them. Furthermore, the central instance must be trusted and is a single point of failure. As a solution, we propose a fully decentralized approach that allows knowledge to be shared between trained models. Neither original training data nor model parameters need to be transmitted. The concept relies on teacher and student roles assigned to the models: students are trained on the output of their teachers via synthetically generated input data. We conduct a case study on log anomaly detection. The results show that an untrained student model, trained on a teacher's output, reaches F1-scores comparable to the teacher's. In addition, we demonstrate that our method allows the synchronization of several models trained on different, distinct subsets of training data.
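The teacher-student synchronization described above can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the paper's implementation: both parties are simple logistic-regression models, the teacher is trained on private data that is never transmitted, and the student only observes the teacher's predictions on synthetically generated inputs (knowledge distillation). All names and model choices here are illustrative assumptions.

```python
# Minimal sketch of decentralized knowledge transfer via a teacher-student
# setup with synthetic inputs. Assumption: both models are logistic
# regressions trained by gradient descent; the real method is model-agnostic.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, epochs=300, lr=0.5):
    """Fit a logistic regression with plain gradient descent.
    Works with hard labels (0/1) or soft labels (probabilities)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad = p - y                      # gradient of the log loss
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Teacher side: trained locally on private data (never shared).
X_priv = rng.normal(size=(500, 4))
y_priv = (X_priv[:, 0] + X_priv[:, 1] > 0).astype(float)
w_t, b_t = train_logreg(X_priv, y_priv)

# Student side: generate synthetic inputs and query ONLY the teacher's
# outputs -- neither the private data nor (w_t, b_t) are transmitted.
X_syn = rng.normal(size=(2000, 4))
y_soft = sigmoid(X_syn @ w_t + b_t)       # teacher predictions (soft labels)
w_s, b_s = train_logreg(X_syn, y_soft)    # distill into the student

# The student should now largely agree with the teacher on unseen data.
X_test = rng.normal(size=(200, 4))
agree = np.mean((sigmoid(X_test @ w_s + b_s) > 0.5)
                == (sigmoid(X_test @ w_t + b_t) > 0.5))
print(f"teacher/student agreement: {agree:.2f}")
```

In this toy setting the distilled student reproduces the teacher's decision boundary almost exactly, which mirrors the abstract's claim that a student trained only on teacher outputs reaches comparable performance.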