Deep learning (DL) has big-data processing capabilities that are as good as, or even better than, those of humans in many real-world domains, but at the cost of high energy requirements that may be unsustainable in some applications, and of errors that, though infrequent, can be large. We hypothesise that a fundamental weakness of DL lies in its intrinsic dependence on integrate-and-fire point neurons that maximise information transmission irrespective of whether it is relevant in the current context. This leads to unnecessary neural firing and to the feedforward transmission of conflicting messages, which makes learning difficult and processing energy-inefficient. Here we show how to circumvent these limitations by mimicking the capabilities of context-sensitive neocortical neurons, which receive input from diverse sources as a context to amplify the transmission of relevant information and attenuate the transmission of irrelevant information. Our results show that, in the case of audio-visual processing, nets composed of context-sensitive local processors can use video information as a context that guides audio signal processing towards the currently relevant information far more effectively and efficiently than current forms of DL.