Federated learning allows a large number of devices to jointly learn a model without sharing data. In this work, we enable clients with limited computing power to perform action recognition, a computationally heavy task. We first perform model compression at the central server through knowledge distillation on a large dataset. This allows the model to learn complex features and serves as an initialization for model fine-tuning. The fine-tuning is required because the limited data in smaller datasets is not adequate for action recognition models to learn complex spatio-temporal features. Because the clients are often heterogeneous in their computing resources, we use asynchronous federated optimization, and we further show a convergence bound. We compare our approach to two baselines: fine-tuning at the central server (no clients), and fine-tuning on (heterogeneous) clients using synchronous federated averaging. We empirically show on a testbed of heterogeneous embedded devices that our approach achieves accuracy comparable to both baselines, while our asynchronous learning strategy reduces training time by 40% relative to synchronous learning.
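For concreteness, below is a minimal sketch of the kind of distillation objective such server-side compression typically relies on, assuming a PyTorch setting; the temperature `T`, mixing weight `alpha`, and the function name `distillation_loss` are illustrative assumptions rather than details taken from the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Hinton-style soft-target KD loss plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),   # student log-probs at temperature T
        F.softmax(teacher_logits / T, dim=1),       # teacher probs at temperature T
        reduction="batchmean",
    ) * (T * T)                                     # rescale to keep gradient magnitude stable
    hard = F.cross_entropy(student_logits, labels)  # standard supervised term
    return alpha * soft + (1.0 - alpha) * hard
```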
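Similarly, a staleness-weighted server update in the spirit of asynchronous federated optimization might look as follows; the polynomial staleness weighting and the parameter names (`base_mixing`, `staleness_exp`) are assumptions for illustration, not the paper's exact update rule.

```python
def server_update(global_state, client_state, client_round, server_round,
                  base_mixing=0.6, staleness_exp=0.5):
    """Mix a (possibly stale) client model into the global model.

    The mixing weight shrinks polynomially with staleness, so an update
    computed against an outdated global model contributes less.
    """
    staleness = server_round - client_round                       # how outdated the client is
    weight = base_mixing * (1.0 + staleness) ** (-staleness_exp)  # decays as staleness grows
    return {name: (1.0 - weight) * w + weight * client_state[name]
            for name, w in global_state.items()}
```

Because each client's contribution is applied as soon as it arrives, faster devices are never blocked waiting for slower ones, which is the mechanism behind the reported reduction in training time relative to synchronous federated averaging.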