Binary classification of spoken words with passive elastic metastructures

Tena Dubček, Daniel Moreno-Garcia, Thomas Haag, Henrik R. Thomsen, Theodor S. Becker, Christoph Bärlocher, Fredrik Andersson, Sebastian D. Huber, Dirk-Jan van Manen, Luis Guillermo Villanueva, Johan O. A. Robertsson, Marc Serra-Garcia

From arXiv, 13 pages, 9 figures

Many electronic devices spend most of their time waiting for a wake-up event: pacemakers waiting for an anomalous heartbeat, security systems on alert to detect an intruder, smartphones listening for the user to say a wake-up phrase. These devices continuously convert physical signals into electrical currents that are then analyzed on a digital computer -- leading to power consumption even when no event is taking place. Solving this problem requires the ability to passively distinguish relevant from irrelevant events (e.g. tell a wake-up phrase from a regular conversation). Here, we experimentally demonstrate an elastic metastructure, consisting of a network of coupled silicon resonators, that passively discriminates between pairs of spoken words -- solving the wake-up problem for scenarios where only two classes of events are possible. This passive speech recognition is demonstrated on a dataset from speakers with significant gender and accent diversity. The geometry of the metastructure is determined during the design process, in which the network of resonators ('mechanical neurones') learns to selectively respond to spoken words. Training is facilitated by a machine learning model that reduces the number of computationally expensive three-dimensional elastic wave simulations. By embedding event detection in the structural dynamics, mechanical neural networks thus enable novel classes of always-on smart devices with no standby power consumption.
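
To make the idea of passive discrimination concrete, the following is a minimal toy sketch, not the authors' silicon metastructure or their training procedure. It assumes a hypothetical lumped model: a short chain of unit-mass, damped, spring-coupled resonators, driven at the first resonator by the input signal, with the class read out by thresholding the time-integrated response of the last resonator. The function names, the parameter values (resonance frequencies, coupling, damping, threshold), and the use of pure tones in place of spoken words are all illustrative assumptions.

```python
import numpy as np


def output_energy(signal, dt, freqs_hz, coupling, damping=50.0):
    """Drive resonator 0 of a linear chain with `signal` and return the
    time-integrated squared displacement of the last resonator."""
    n = len(freqs_hz)
    omega2 = (2.0 * np.pi * np.asarray(freqs_hz, dtype=float)) ** 2
    # Stiffness matrix: on-site terms plus nearest-neighbour spring coupling.
    K = np.diag(omega2)
    for i in range(n - 1):
        K[i, i] += coupling
        K[i + 1, i + 1] += coupling
        K[i, i + 1] -= coupling
        K[i + 1, i] -= coupling
    x = np.zeros(n)  # displacements
    v = np.zeros(n)  # velocities
    energy = 0.0
    for s in signal:
        force = np.zeros(n)
        force[0] = s  # the input acts on the first resonator only
        # Semi-implicit Euler step (unit masses, viscous damping).
        a = force - K @ x - damping * v
        v = v + dt * a
        x = x + dt * v
        energy += x[-1] ** 2 * dt
    return energy


def classify(signal, dt, freqs_hz, coupling, threshold):
    """Binary decision: True if the output resonator responds strongly."""
    return output_energy(signal, dt, freqs_hz, coupling) > threshold


# Illustrative usage with two pure tones standing in for the two classes.
fs = 16000.0                       # sample rate in Hz (assumed)
dt = 1.0 / fs
t = np.arange(0.0, 0.2, dt)
freqs_hz = [500.0, 500.0]          # hypothetical resonator frequencies
coupling = (2.0 * np.pi * 100.0) ** 2

tone_on = np.sin(2.0 * np.pi * 500.0 * t)    # inside the passband
tone_off = np.sin(2.0 * np.pi * 1500.0 * t)  # far from the passband

e_on = output_energy(tone_on, dt, freqs_hz, coupling)
e_off = output_energy(tone_off, dt, freqs_hz, coupling)
threshold = np.sqrt(e_on * e_off)  # placed between the two responses
print(classify(tone_on, dt, freqs_hz, coupling, threshold),
      classify(tone_off, dt, freqs_hz, coupling, threshold))
```

In this sketch the "decision" is made entirely by the linear dynamics of the chain; only the final threshold comparison is done in software. In the design process described in the abstract, the geometry of the structure is what is learned, which in this toy model would correspond to tuning the resonator frequencies and couplings.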
