Recent advances in recurrent neural networks have driven tremendous progress in sequential processing. However, recurrent architectures suffer from exploding and vanishing gradients during training, and back-propagation through time requires significant computational resources. Moreover, complex sequential tasks typically call for large models. To address these challenges, we propose a novel neuron model for sequential processing that uses a cosine activation with a time-varying component. The proposed neuron is an efficient building block for projecting sequential inputs into the spectral domain, which helps retain long-term dependencies with minimal extra model parameters and computation. We present a new recurrent network architecture based on this neuron, named Oscillatory Fourier Neural Network, and apply it to various types of sequential tasks. We demonstrate that a recurrent neural network built from the proposed neuron is mathematically equivalent to a simplified form of the discrete Fourier transform applied to a periodic activation. In particular, the computationally intensive back-propagation through time is eliminated from training, leading to faster training while achieving state-of-the-art inference accuracy across a diverse group of sequential tasks. For instance, applied to sentiment analysis on the IMDB review dataset, the proposed model reaches 89.4% test accuracy within 5 epochs, with a model more than 35x smaller than an LSTM. The proposed RNN architecture is well suited to intelligent sequential processing on resource-constrained hardware.
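The abstract does not state the neuron's equations, so the following is only a minimal sketch under assumed definitions, not the authors' implementation. It assumes each hidden unit accumulates cos(W x_t + b + ω t) over time, where `W`, `b`, and `freqs` (the per-unit angular frequencies ω) are hypothetical names introduced here. Under that assumption, the recurrent running sum and a single closed-form pass over the sequence give the same result, and for a scalar pre-activation the sum matches the real part of a discrete Fourier transform of a periodic activation.

```python
import numpy as np


def oscillatory_sum(x, W, b, freqs):
    """Closed-form view: average cos(W x_t + b + freq * t) over all time steps.

    x: (T, d_in) input sequence; W: (d_in, d_hidden); b, freqs: (d_hidden,).
    All names and the exact functional form are assumptions for illustration.
    """
    steps = np.arange(x.shape[0])[:, None]        # (T, 1) time indices
    phase = x @ W + b + steps * freqs[None, :]    # (T, d_hidden)
    return np.cos(phase).mean(axis=0)


def oscillatory_recurrent(x, W, b, freqs):
    """Recurrent view: the state is a plain running sum, so the previous state
    never passes through a nonlinearity on its way to the next step."""
    h = np.zeros(W.shape[1])
    for t, x_t in enumerate(x):
        h = h + np.cos(x_t @ W + b + freqs * t)
    return h / x.shape[0]


def dft_of_periodic_activation(a):
    """For a scalar pre-activation a_t, sum_t cos(a_t - 2*pi*k*t/T) equals the
    real part of the DFT of exp(1j * a_t), by Euler's formula."""
    T = a.shape[0]
    k = np.arange(T)[:, None]                     # frequency index, one row each
    t = np.arange(T)[None, :]                     # time index, one column each
    direct = np.cos(a[None, :] - 2 * np.pi * k * t / T).sum(axis=1)
    via_fft = np.fft.fft(np.exp(1j * a)).real
    return direct, via_fft
```

Because the assumed state update is purely additive, the gradient with respect to `W` and `b` at each time step does not depend on earlier hidden states, which is one way to read the abstract's claim that back-propagation through time is eliminated.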