Recent years have witnessed the outstanding success of deep learning in fields such as vision and natural language processing. This success owes much to the massive size of deep learning models, which is expected to keep growing. This growth is accompanied by issues related to the models' considerable energy consumption, during both training and inference, as well as their scalability. Although a number of works based on unconventional physical systems have been proposed to address energy efficiency in the inference phase, efficient training of deep learning models has remained unaddressed. So far, training of digital deep learning models has mainly relied on backpropagation, which is not suitable for physical implementation because it requires perfect knowledge of the computation performed in the so-called forward pass of the neural network. Here, we tackle this issue by proposing a simple deep neural network architecture augmented by a biologically plausible learning algorithm, referred to as "model-free forward-forward training". The proposed architecture enables the training of deep physical neural networks consisting of layers of physical nonlinear systems, without requiring detailed knowledge of the properties of those nonlinear physical layers. We show that our method outperforms state-of-the-art hardware-aware training methods by improving training speed, decreasing digital computations, and reducing power consumption in physical systems. We demonstrate the adaptability of the proposed method, even in systems exposed to dynamic or unpredictable external perturbations. To showcase the universality of our approach, we experimentally train diverse wave-based physical neural networks, which differ in the underlying wave phenomenon and in the type of nonlinearity they use, to perform vowel and image classification tasks.
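To make the idea of training without a model of the physical layers more concrete, the following is a minimal numerical sketch and not the implementation used in this work. It assumes a layer layout in which each opaque nonlinear system is followed by a trainable digital linear map, and it borrows the goodness-based local loss with "positive" and "negative" samples from the forward-forward algorithm. The names `make_black_box`, `train_linear`, the `tanh` stand-in for the physical system, the toy data, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np

def make_black_box(n_in, n_out, rng):
    """Hypothetical stand-in for a physical nonlinear layer (e.g. a wave system)."""
    M = rng.normal(size=(n_out, n_in)) / np.sqrt(n_in)
    b = rng.normal(size=n_out)
    # The training code below only calls this function; it never reads M or b.
    return lambda x: np.tanh(x @ M.T + b)

def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable logistic

def goodness(y):
    return np.sum(y ** 2, axis=1)  # one scalar per sample

def normalize(y):
    return y / (np.linalg.norm(y, axis=1, keepdims=True) + 1e-8)

def train_linear(W, h_pos, h_neg, theta=2.0, lr=0.01, steps=300):
    """Local update of one trainable linear map W on measured features h."""
    for _ in range(steps):
        y_pos, y_neg = h_pos @ W.T, h_neg @ W.T
        # Local loss: mean softplus(theta - g_pos) + mean softplus(g_neg - theta).
        c_pos = -sigmoid(theta - goodness(y_pos))  # dLoss/dg, positive samples
        c_neg = sigmoid(goodness(y_neg) - theta)   # dLoss/dg, negative samples
        # dg/dW = 2 * outer(y, h): only the digital linear map is differentiated.
        grad = 2.0 * ((c_pos[:, None] * y_pos).T @ h_pos / len(h_pos)
                      + (c_neg[:, None] * y_neg).T @ h_neg / len(h_neg))
        W = W - lr * grad
    return W

rng = np.random.default_rng(0)
p, q = rng.normal(size=8), rng.normal(size=8)  # toy positive/negative patterns
x_pos = p + 0.1 * rng.normal(size=(64, 8))
x_neg = q + 0.1 * rng.normal(size=(64, 8))

sizes = [8, 16, 16]  # input width, then the width of each layer (assumed)
weights, gaps = [], []
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    box = make_black_box(n_in, n_out, rng)
    h_pos, h_neg = box(x_pos), box(x_neg)  # forward "measurements" only
    W = train_linear(0.1 * rng.normal(size=(n_out, n_out)), h_pos, h_neg)
    y_pos, y_neg = h_pos @ W.T, h_neg @ W.T
    gaps.append(float(goodness(y_pos).mean() - goodness(y_neg).mean()))
    weights.append(W)
    x_pos, x_neg = normalize(y_pos), normalize(y_neg)  # pass on direction only
```

In this sketch each layer is trained greedily from forward measurements alone, and `gaps` records, per layer, the difference between the mean goodness of positive and negative samples. Because the gradient is taken only through the digital map `W`, the black-box function could in principle be replaced by readings from a real device without changing the update rule.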