The most effective data-driven methods for human activity recognition (HAR) are based on supervised learning applied to the continuous stream of sensor data. However, these methods perform well only on restricted sets of activities in domains for which a fully labeled dataset is available. Coping with the intra- and inter-subject variability of activity execution in large-scale real-world deployments remains a challenge. Semi-supervised learning approaches for HAR have been proposed to address the difficulty of acquiring the large amount of labeled data that realistic settings require. However, their centralized architecture raises scalability and privacy problems when the process involves a large number of users. Federated Learning (FL) is a promising paradigm to address these problems. However, the FL methods that have been proposed for HAR assume that the participating users can always obtain labels to train their local models. In this work, we propose FedHAR, a novel hybrid method for HAR that combines semi-supervised and federated learning. Specifically, FedHAR combines active learning and label propagation to semi-automatically annotate the local streams of unlabeled sensor data, and it relies on FL to build a global activity model in a scalable and privacy-aware fashion. FedHAR also includes a transfer learning strategy to personalize the global model for each user. We evaluated our method on two public datasets, showing that FedHAR reaches recognition rates and personalization capabilities similar to those of state-of-the-art supervised FL approaches. As a major advantage, FedHAR requires only a very limited amount of annotated data to initialize a pre-trained model and a small number of active learning questions, which quickly decreases as the system is used, leading to an effective and scalable solution for the data scarcity problem of HAR.
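As a rough illustration of two ingredients named in the abstract, the sketch below pairs a confidence-threshold rule for deciding when to ask the user for a label (active learning) with a sample-weighted average of client model weights (the server-side step in FL). The abstract does not specify either rule, so both are assumptions: the least-confidence threshold, the FedAvg-style weighting, and the names `needs_label_query` and `federated_average` are hypothetical and are not taken from FedHAR itself.

```python
from typing import List, Sequence, Tuple


def needs_label_query(probabilities: Sequence[float], threshold: float = 0.6) -> bool:
    """Assumed active-learning rule: ask the user for a label when the
    classifier's most likely activity has probability below `threshold`."""
    return max(probabilities) < threshold


def federated_average(updates: List[Tuple[List[float], int]]) -> List[float]:
    """Assumed FedAvg-style aggregation: average the clients' weight
    vectors, weighting each by its number of local training samples."""
    total_samples = sum(n for _, n in updates)
    if total_samples == 0:
        raise ValueError("at least one client must contribute samples")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total_samples
    return averaged


if __name__ == "__main__":
    # A confident prediction needs no query; an uncertain one does.
    print(needs_label_query([0.9, 0.05, 0.05]))
    print(needs_label_query([0.5, 0.3, 0.2]))
    # Two hypothetical clients with 1 and 3 local samples.
    print(federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))
```

In this toy setup only the weight vectors and sample counts leave the client, which is the privacy-relevant property the abstract attributes to FL; the raw sensor stream and any user-provided labels stay local.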