Voice assistants overhear conversations, so a consent management mechanism is required. Consent management can be implemented using speaker recognition: users who do not give consent enrol their voice, and all their subsequent recordings are discarded. Building speaker recognition-based consent management is challenging because dynamic registration, removal, and re-registration of speakers must be handled efficiently. This work proposes a consent management system that addresses these challenges. Contrastive training is applied to learn the underlying speaker equivariance inductive bias. The contrastive features for buckets of speakers are trained a few steps into each iteration and act as replay buffers. These features are progressively selected for classification using a multi-strided random sampler. Moreover, new methods are proposed for dynamic registration using a portion of old utterances, and for removal and re-registration of speakers. The results confirm the memory efficiency and dynamic capabilities of the proposed methods, which outperform the existing approach from the literature.
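The opt-out flow described above (enrol, discard, remove, re-register) can be illustrated with a minimal sketch. This is not the proposed system: it uses a plain cosine-similarity lookup over stored embeddings rather than contrastive training with replay buffers. The class name `ConsentRegistry`, the threshold of 0.8, and the toy three-dimensional embeddings are all assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ConsentRegistry:
    """Hypothetical registry of speakers who have NOT given consent."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed decision threshold
        self.enrolled = {}          # speaker id -> reference embedding

    def register(self, speaker_id, embedding):
        # Also covers re-registration: a new embedding replaces the old one.
        self.enrolled[speaker_id] = list(embedding)

    def remove(self, speaker_id):
        # Removing an unknown speaker is a no-op.
        self.enrolled.pop(speaker_id, None)

    def should_discard(self, embedding):
        """True if the recording matches any enrolled (non-consenting) speaker."""
        return any(cosine(embedding, ref) >= self.threshold
                   for ref in self.enrolled.values())


registry = ConsentRegistry()
registry.register("alice", [1.0, 0.0, 0.0])
print(registry.should_discard([0.9, 0.1, 0.0]))  # close to alice's embedding
registry.remove("alice")
print(registry.should_discard([0.9, 0.1, 0.0]))  # nobody enrolled any more
```

In a real deployment the embeddings would come from a trained speaker encoder, and the linear scan over enrolled speakers is exactly the kind of cost that an efficient dynamic registration scheme aims to avoid.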