Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks, due to their promise of increased compute efficiency on event-driven neuromorphic hardware. However, to perform well on complex vision applications, most SNN training frameworks yield large inference latency, which translates to increased spike activity and reduced energy efficiency. Hence, minimizing average spike activity while preserving accuracy in deep SNNs remains a significant challenge and opportunity. This paper presents a non-iterative SNN training technique that achieves ultra-high compression with reduced spiking activity while maintaining high inference accuracy. In particular, our framework first uses the attention maps of an uncompressed meta-model to yield compressed ANNs. This step can be tuned to support both irregular and structured channel pruning to leverage computational benefits over a broad range of platforms. The framework then performs sparse-learning-based supervised SNN training using direct inputs. During training, it jointly optimizes the SNN weight, threshold, and leak parameters to drastically minimize the number of time steps required while retaining compression. To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet on both CIFAR-10 and CIFAR-100, and with VGG16 on Tiny-ImageNet. The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drop in accuracy compared to baseline unpruned counterparts. Compared to existing SNN pruning methods, we achieve up to 8.3x higher compression with improved accuracy.
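To make the training step concrete, the sketch below shows, in PyTorch, the kind of spiking layer the abstract describes: a leaky integrate-and-fire (LIF) neuron whose threshold and leak are trained jointly with the (mask-pruned) weights, using a surrogate gradient for the non-differentiable spike and a direct analog input repeated over a small number of time steps. This is an illustrative reconstruction under stated assumptions, not the authors' code; names such as SurrogateSpike and MaskedLIFLinear, the triangular surrogate, and the soft reset are our own choices.

```python
# Minimal sketch (assumed, not the paper's implementation) of sparse-learning SNN
# training with jointly learnable weight, threshold, and leak parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, triangular surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, membrane, threshold):
        ctx.save_for_backward(membrane, threshold)
        return (membrane >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        membrane, threshold = ctx.saved_tensors
        # Gradient is largest near the threshold and zero far from it.
        surrogate = torch.clamp(1.0 - torch.abs(membrane - threshold), min=0.0)
        grad_membrane = grad_output * surrogate
        grad_threshold = -(grad_output * surrogate).sum(dim=0)
        return grad_membrane, grad_threshold


class MaskedLIFLinear(nn.Module):
    """Fully connected LIF layer with a fixed sparsity mask (emulating the
    compression step) and learnable leak/threshold parameters."""

    def __init__(self, in_features, out_features, density=0.1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Binary connectivity mask standing in for irregular pruning; kept fixed here.
        self.register_buffer("mask", (torch.rand(out_features, in_features) < density).float())
        self.threshold = nn.Parameter(torch.ones(out_features))     # learnable firing threshold
        self.leak = nn.Parameter(torch.full((out_features,), 0.9))  # learnable membrane leak

    def forward(self, x_seq):
        # x_seq: (time_steps, batch, in_features) -- direct input repeated over time.
        batch = x_seq.shape[1]
        membrane = torch.zeros(batch, self.weight.shape[0], device=x_seq.device)
        spikes = []
        for x_t in x_seq:
            current = F.linear(x_t, self.weight * self.mask)
            membrane = self.leak * membrane + current
            spike = SurrogateSpike.apply(membrane, self.threshold)
            membrane = membrane - spike * self.threshold  # soft reset
            spikes.append(spike)
        return torch.stack(spikes)  # (time_steps, batch, out_features)


if __name__ == "__main__":
    layer = MaskedLIFLinear(784, 128, density=0.1)
    x = torch.rand(5, 4, 784)  # 5 time steps, batch of 4
    out = layer(x)
    print(out.shape, "spike rate:", out.mean().item())
```

In a full pipeline, the mask would come from attention-map-guided pruning of the uncompressed meta-model, and reducing the number of time steps directly reduces the spike count and hence the energy proxy the abstract targets.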