Waves, such as light and sound, inherently bounce and mix because of multiple scattering induced by the complex material objects that surround us. This scattering process severely scrambles the information carried by waves, challenging conventional communication systems, sensing paradigms, and wave-based computing schemes. Here, we show that rather than being a hindrance, multiple scattering can be harnessed to enable and enhance analog nonlinear information mapping, allowing for the direct physical implementation of computational paradigms such as reservoir computing and extreme learning machines. We propose a physics-inspired version of such computational architectures for speech and vowel recognition that operates directly in the native domain of the input signal, namely on real sounds, without any digital pre-processing, encoding conversion, or backpropagation training. We first implement it in a proof-of-concept prototype: a nonlinear chaotic acoustic cavity containing multiple tunable and power-efficient nonlinear meta-scatterers. We demonstrate the efficiency of this acoustic computing system on vowel recognition tasks, reaching a test classification accuracy of 91.4%. Finally, we demonstrate high-performance vowel recognition in the natural environment of a reverberation room. Our results open the way for efficient acoustic learning machines that operate directly on the input sound and leverage physics to enable Natural Language Processing (NLP).
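As a purely digital analogy for the extreme learning machine idea mentioned above, the following minimal sketch uses a fixed random projection followed by a `tanh` nonlinearity as a stand-in for the nonlinear scattering medium, and trains only a linear readout by closed-form ridge regression, so no backpropagation is involved. It is not the acoustic system described here: the function names, the hidden-layer size, the ridge parameter, and the synthetic Gaussian clusters (a placeholder for vowel recordings) are all illustrative assumptions.

```python
import numpy as np


def random_nonlinear_map(x, w_in, bias):
    """Fixed, untrained nonlinear map (assumed stand-in for the scattering medium)."""
    return np.tanh(x @ w_in + bias)


def train_readout(h, targets, ridge=1e-3):
    """Closed-form ridge regression for the linear readout; no backpropagation."""
    gram = h.T @ h + ridge * np.eye(h.shape[1])
    return np.linalg.solve(gram, h.T @ targets)


def toy_dataset(rng, n_per_class=100, n_classes=3, dim=8, noise=0.3):
    """Synthetic Gaussian clusters, a hypothetical placeholder for vowel data."""
    centers = 2.0 * rng.normal(size=(n_classes, dim))
    x = np.concatenate(
        [c + noise * rng.normal(size=(n_per_class, dim)) for c in centers]
    )
    y = np.repeat(np.arange(n_classes), n_per_class)
    return x, y


def run_demo(seed=0, n_hidden=64):
    """Train the readout on a random split and return the test accuracy."""
    rng = np.random.default_rng(seed)
    x, y = toy_dataset(rng)
    n_classes = int(y.max()) + 1

    # Shuffle, then hold out the last third of the samples for testing.
    order = rng.permutation(len(y))
    train, test = order[:200], order[200:]

    # Random input weights and biases are drawn once and never updated.
    dim = x.shape[1]
    w_in = rng.normal(size=(dim, n_hidden)) / np.sqrt(dim)
    bias = rng.normal(size=n_hidden)

    # Only the linear readout is fitted, against one-hot class targets.
    h_train = random_nonlinear_map(x[train], w_in, bias)
    w_out = train_readout(h_train, np.eye(n_classes)[y[train]])

    h_test = random_nonlinear_map(x[test], w_in, bias)
    pred = np.argmax(h_test @ w_out, axis=1)
    return float(np.mean(pred == y[test]))


if __name__ == "__main__":
    print(f"test accuracy: {run_demo():.3f}")
```

The design choice this illustrates is that the nonlinear mapping stays fixed while training reduces to solving one linear system for the readout, which is what makes a physical medium a candidate for the mapping stage.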