Keyword spotting (KWS) plays an essential role in enabling speech-based user interaction on smart devices, and conventional KWS (C-KWS) approaches have concentrated on detecting user-agnostic, pre-defined keywords. In practice, however, most interactions come from target users enrolled on the device, which motivates personalized keyword spotting. We design two personalized KWS tasks: (1) Target user Biased KWS (TB-KWS) and (2) Target user Only KWS (TO-KWS). To solve these tasks, we propose personalized keyword spotting through multi-task learning (PK-MTL), which consists of multi-task learning and task adaptation. First, we apply multi-task learning to keyword spotting and speaker verification so that the keyword spotting system can leverage user information. Next, we design task-specific scoring functions to adapt the system fully to each personalized KWS task. We evaluate our framework in conventional and personalized scenarios, and the results show that PK-MTL can dramatically reduce the false alarm rate, especially in practical scenarios.
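To make the two personalized tasks concrete, the following is a minimal sketch of how a keyword score and a speaker-verification score might be combined at decision time. It is not the scoring function used by PK-MTL, which is not given here: the function names, the additive bias for TB-KWS, the hard gate for TO-KWS, and the `weight` and `sim_threshold` parameters are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as carrying no speaker evidence
    return dot / (norm_a * norm_b)


def tb_kws_score(keyword_prob, speaker_sim, weight=0.5):
    """Hypothetical TB-KWS score: any speaker can trigger the keyword,
    but similarity to the enrolled user adds a bonus (assumed form)."""
    return keyword_prob + weight * max(speaker_sim, 0.0)


def to_kws_score(keyword_prob, speaker_sim, sim_threshold=0.7):
    """Hypothetical TO-KWS score: the keyword counts only when the
    speaker is similar enough to the enrolled user (assumed form)."""
    return keyword_prob if speaker_sim >= sim_threshold else 0.0


# Toy embeddings standing in for speaker-verification outputs.
enrolled = [1.0, 0.0]
same_speaker = [1.0, 0.0]
other_speaker = [0.0, 1.0]

sim_same = cosine_similarity(enrolled, same_speaker)
sim_other = cosine_similarity(enrolled, other_speaker)
```

Under these assumptions, an utterance from a non-enrolled speaker keeps its raw keyword score in the biased task but is suppressed entirely in the target-only task, which is one way a false alarm from another speaker could be rejected.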