Despite the recent success of machine learning algorithms, most models still struggle with complex tasks that require interaction between different sources of information, such as multimodal input data and temporally ordered sequences. The biological brain, by contrast, is highly adept at this: shaped by millions of years of evolution, it automatically manages and integrates such streams of information. In this context, this paper draws inspiration from recent discoveries about cortical circuits in the brain to propose a more biologically plausible self-supervised machine learning approach, the so-called Canonical Cortical Graph Neural Networks, which combines multimodal information using intra-layer modulations together with Canonical Correlation Analysis, and uses a memory mechanism to keep track of temporal data. The approach outperformed recent state-of-the-art results in terms of clean audio reconstruction and energy efficiency, the latter described by a reduced and smoother neuron firing rate distribution, suggesting that the model is a suitable approach for speech enhancement in audio-visual hearing aid devices.
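As background for the Canonical Correlation Analysis (CCA) component mentioned above, the following is a minimal sketch of classical linear CCA between two views. It is not the model proposed in the paper: the function name `cca`, the regularisation constant `reg`, and the synthetic "audio" and "visual" arrays are all assumptions made for illustration only.

```python
import numpy as np


def cca(X, Y, n_components=1, reg=1e-6):
    """Classical linear CCA via whitening followed by an SVD.

    X, Y: arrays of shape (n_samples, n_features_x / n_features_y).
    Returns projection matrices Wx, Wy and the canonical correlations.
    `reg` is a small ridge term (an assumption) that keeps the
    covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Within-view and cross-view covariance estimates.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations; singular vectors give the directions.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :n_components]
    Wy = Ky @ Vt.T[:, :n_components]
    return Wx, Wy, s[:n_components]


# Synthetic two-view data (hypothetical): both views contain a noisy
# copy of one shared latent signal plus unrelated noise features.
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 1))
audio = np.hstack([shared + 0.1 * rng.normal(size=(500, 1)),
                   rng.normal(size=(500, 2))])
visual = np.hstack([rng.normal(size=(500, 2)),
                    -shared + 0.1 * rng.normal(size=(500, 1))])

Wa, Wv, corrs = cca(audio, visual)

# Correlation of the two projected views along the first canonical pair.
proj_a = (audio - audio.mean(axis=0)) @ Wa[:, 0]
proj_v = (visual - visual.mean(axis=0)) @ Wv[:, 0]
proj_corr = np.corrcoef(proj_a, proj_v)[0, 1]
```

In this toy setting the first canonical pair recovers the shared latent signal, so the projected views are strongly correlated even though the signal sits in different feature positions, and with opposite sign, in each view.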