Spiking Neural Networks (SNNs) offer a practical approach toward more data-efficient deep learning by simulating neurons that leverage temporal information. In this paper, we propose the Temporal-Channel Joint Attention (TCJA) architectural unit, an efficient attention-based SNN technique that effectively enforces the relevance of the spike sequence along both spatial and temporal dimensions. Our essential technical contributions are: 1) compressing the spike stream into an average matrix with a squeeze operation, then applying two local attention mechanisms built on efficient 1-D convolutions to establish temporal-wise and channel-wise relations for feature extraction in a flexible fashion; and 2) introducing the Cross Convolutional Fusion (CCF) layer to model inter-dependencies between the temporal and channel scopes, which breaks the independence of the two dimensions and enables interaction between their features. By jointly exploring and recalibrating the data stream, our method outperforms the state-of-the-art (SOTA) by up to 15.7% in top-1 classification accuracy on all tested mainstream static and neuromorphic datasets, including Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture.
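To make the squeeze-then-attend pipeline concrete, the following is a minimal PyTorch sketch of a TCJA-style unit. It assumes a spike tensor of shape (B, T, C, H, W); the module name `TCJASketch`, the kernel size, and the fusion rule (element-wise product followed by a sigmoid) are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class TCJASketch(nn.Module):
    """Hypothetical sketch of a temporal-channel joint attention unit.

    Assumes spike input of shape (B, T, C, H, W). Kernel size and the
    product + sigmoid fusion are illustrative choices, not the paper's
    exact CCF formulation.
    """
    def __init__(self, channels: int, timesteps: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # 1-D conv sliding over the channel axis, with the T timesteps
        # acting as conv channels (temporal-wise relation).
        self.conv_t = nn.Conv1d(timesteps, timesteps, kernel_size, padding=pad)
        # 1-D conv sliding over the temporal axis, with the C feature
        # maps acting as conv channels (channel-wise relation).
        self.conv_c = nn.Conv1d(channels, channels, kernel_size, padding=pad)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape
        # Squeeze: average the spike stream over space -> (B, T, C) matrix.
        z = x.mean(dim=(3, 4))
        # Two local 1-D attentions on the squeezed matrix.
        a_t = self.conv_t(z)                   # (B, T, C)
        a_c = self.conv_c(z.transpose(1, 2))   # (B, C, T)
        # Cross fusion (assumed): combine both views and squash to (0, 1).
        score = torch.sigmoid(a_t * a_c.transpose(1, 2))  # (B, T, C)
        # Recalibrate the original spike tensor with the joint scores.
        return x * score.view(b, t, c, 1, 1)

# Usage on a surrogate spike tensor (B=4, T=8, C=64, H=W=32).
x = torch.rand(4, 8, 64, 32, 32)
y = TCJASketch(channels=64, timesteps=8)(x)
```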