This paper studies how to develop accurate and interpretable time series classification (TSC) models with the help of external data in a privacy-preserving federated learning (FL) scenario. To the best of our knowledge, we are the first to study this essential topic. Achieving this goal requires seamlessly integrating techniques from multiple fields, including data mining, machine learning, and security. We formulate the problem and identify the interpretability constraints under the FL setting. We systematically investigate existing TSC solutions for the centralized scenario and propose FedST, a novel FL-enabled TSC framework based on a shapelet transformation method. We identify the federated shapelet search step as the kernel of FedST and design FedSS-B, a basic protocol for this kernel that we prove to be secure and accurate. We then identify the efficiency bottlenecks of the basic protocol and propose optimizations tailored to the FL setting to accelerate it. Our theoretical analysis shows that the proposed optimizations are secure and more efficient. We conduct extensive experiments on both synthetic and real-world datasets. Empirical results show that FedST is effective in terms of TSC accuracy and that the proposed optimizations can achieve three orders of magnitude of speedup.