We trained a keyword spotting model using federated learning on real user devices and observed significant improvements when the model was deployed for inference on phones. To compensate for data domains that are missing from on-device training caches, we employed joint federated-centralized training. And to learn in the absence of curated labels on-device, we formulated a confidence filtering strategy based on user-feedback signals for federated distillation. These techniques created models that significantly improved quality metrics in offline evaluations and user-experience metrics in live A/B experiments.
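For illustration, here is a minimal sketch of the kind of confidence filtering the abstract describes for federated distillation: on-device cache examples are kept as soft distillation targets only when the teacher's confidence is high and the user-feedback signal is consistent with the teacher's prediction. All names (`filter_distillation_examples`, `feedback_accept`) and the threshold value are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of confidence filtering for federated distillation.
# Names and the threshold are illustrative assumptions, not the paper's method.
import numpy as np

def filter_distillation_examples(teacher_probs, feedback_accept, conf_threshold=0.8):
    """Select on-device cache examples to use as soft distillation targets.

    teacher_probs:   (N, C) array of teacher-model class probabilities.
    feedback_accept: (N,) boolean array; True when user-feedback signals
                     (e.g. the user did not cancel the triggered action)
                     agree with the teacher's prediction.
    Returns the indices of examples kept for the local distillation loss.
    """
    confidence = teacher_probs.max(axis=1)            # teacher's top-class confidence
    keep = (confidence >= conf_threshold) & feedback_accept
    return np.nonzero(keep)[0]

# Example: 4 cached utterances, 2 classes (keyword / not-keyword).
teacher_probs = np.array([[0.95, 0.05],
                          [0.55, 0.45],
                          [0.10, 0.90],
                          [0.85, 0.15]])
feedback_accept = np.array([True, True, True, False])
print(filter_distillation_examples(teacher_probs, feedback_accept))  # -> [0 2]
```

In this sketch the low-confidence example (index 1) and the example contradicted by user feedback (index 3) are dropped, so only reliably labeled utterances contribute to the federated distillation update.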