This paper presents the design and implementation of WhisperWand, a comprehensive voice and motion tracking interface for voice assistants. Unlike prior work, WhisperWand is a precise tracking interface that can co-exist with the voice interface on voice assistants with low sampling rates. Taking handwriting as a specific application, it can also capture natural strokes and individual writing style while occupying only a single frequency. The core technique is an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, which lets a voice assistant use ultrasound as the ranging signal and its regular microphone system as the receiver. We also design a new optimization algorithm that requires only a single frequency to compute time difference of arrival. The WhisperWand prototype achieves a median error of 73 µm for 1D ranging and 1.4 mm for 3D tracking of an acoustic beacon, using the microphone array found in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, comparable to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication increases only from 6.26% to 8.28%.
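To make the time-difference-of-arrival step concrete, the following is a minimal sketch of the textbook linearized TDoA solver. It is not the single-frequency optimization algorithm proposed in the paper. The function name `locate_from_tdoa`, the five-microphone layout, the beacon position, and the 343 m/s speed of sound are all illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def locate_from_tdoa(mics, tdoa, c=SPEED_OF_SOUND):
    """Estimate a 3D source position from TDoAs relative to microphone 0.

    Hypothetical helper using the standard linearization: with s' the source
    position relative to mic 0, R0 = |s'|, and d_i the range difference of
    mic i, each mic gives  2 * rel_i . s' + 2 * d_i * R0 = |rel_i|^2 - d_i^2,
    which is linear in the unknowns (s', R0).
    """
    mics = np.asarray(mics, dtype=float)
    rel = mics[1:] - mics[0]                    # mic positions relative to mic 0
    d = c * np.asarray(tdoa, dtype=float)       # range differences in metres
    A = np.hstack([2.0 * rel, 2.0 * d[:, None]])
    b = np.sum(rel ** 2, axis=1) - d ** 2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:3] + mics[0]                    # back to the original frame


# Illustrative, noise-free example with an assumed non-coplanar array.
mics = [[0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0],
        [0.0, 0.1, 0.0],
        [0.0, 0.0, 0.1],
        [0.1, 0.1, 0.1]]
source = np.array([0.3, 0.2, 0.4])
dist = np.linalg.norm(source - np.asarray(mics), axis=1)
tdoa = (dist[1:] - dist[0]) / SPEED_OF_SOUND
estimate = locate_from_tdoa(mics, tdoa)
print(estimate)
```

The sketch needs at least five microphones that do not all lie in one plane. With a planar array, the height column of the linear system is all zeros, so height would have to be recovered separately from the estimated reference range.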