Voice user interfaces and digital assistants are rapidly entering our lives and becoming singular touch points spanning our devices. These always-on services capture and transmit our audio data to powerful cloud services for further processing and subsequent actions. Our voices and the raw audio signals collected through these devices contain a host of sensitive paralinguistic information that is transmitted to service providers whether the trigger was deliberate or false. Because our emotional patterns and sensitive attributes such as identity, gender, and well-being are easily inferred using deep acoustic models, using these services exposes us to a new generation of privacy risks. One approach to mitigating the risk of paralinguistic-based privacy breaches is to combine cloud-based processing with privacy-preserving, on-device learning and filtering of paralinguistic information before voice data is transmitted. In this paper, we introduce EDGY, a configurable, lightweight, disentangled representation learning framework that transforms and filters high-dimensional voice data to identify and contain sensitive attributes at the edge prior to offloading to the cloud. We evaluate EDGY's on-device performance and explore optimization techniques, including model quantization and knowledge distillation, to enable private, accurate, and efficient representation learning on resource-constrained devices. Our results show that EDGY runs in tens of milliseconds, with a 0.2% relative improvement in "zero-shot" ABX score or minimal performance penalties of approximately 5.95% word error rate (WER) in learning linguistic representations from raw voice signals, using a CPU and a single-core ARM processor without specialized hardware.