Spiking neural networks (SNNs) have made great progress in both performance and efficiency over the last few years, but their unique working pattern makes it hard to train a high-performance, low-latency SNN. As a result, the development of SNNs still lags behind that of traditional artificial neural networks (ANNs). Many excellent works have been proposed to close this gap. Nevertheless, these works are mainly based on the same kind of network structure (i.e., CNN), and their performance is worse than that of their ANN counterparts, which limits the applications of SNNs. To this end, we propose a novel Transformer-based SNN, termed "Spikeformer", which outperforms its ANN counterpart on both static and neuromorphic datasets and may be an alternative architecture to CNN for training high-performance SNNs. First, to address the "data hungry" problem and the unstable training exhibited by the vanilla model, we design the Convolutional Tokenizer (CT) module, which improves the accuracy of the original model on DVS-Gesture by more than 16%. In addition, to better combine the attention mechanism of the Transformer with the spatio-temporal information inherent to SNNs, we adopt spatio-temporal attention (STA) instead of spatial-wise or temporal-wise attention. With our proposed method, we achieve competitive or state-of-the-art (SOTA) SNN performance on the DVS-CIFAR10, DVS-Gesture, and ImageNet datasets with the fewest simulation time steps (i.e., low latency). Remarkably, our Spikeformer outperforms other SNNs on ImageNet by a large margin (i.e., more than 5%) and even outperforms its ANN counterpart by 3.1% and 2.2% on DVS-Gesture and ImageNet, respectively, indicating that Spikeformer is a promising architecture for training large-scale SNNs and may be more suitable for SNNs than CNN. We believe that this work will help keep the development of SNNs in step with that of ANNs as much as possible. Code will be made available.