Vision Transformers (ViTs) have established new performance benchmarks in vision tasks such as image recognition and object detection. However, these advancements come with significant demands for memory and computational resources, presenting challenges for hardware deployment. Heterogeneous compute-in-memory (CIM) accelerators have emerged as a promising solution for energy-efficient deployment of ViTs. Despite this potential, monolithic CIM-based designs face scalability issues due to the size limitations of a single chip. Emerging chiplet-based techniques offer a more scalable alternative. However, chiplet designs carry their own cost: communication over the network-on-package (NoP) is more expensive than over the network-on-chip (NoC), which can limit throughput gains. This work introduces Hemlet, a heterogeneous CIM chiplet system designed to accelerate ViTs. Hemlet facilitates flexible resource scaling through the integration of heterogeneous analog CIM (ACIM), digital CIM (DCIM), and Intermediate Data Process (IDP) chiplets. The system aims to improve throughput while reducing communication overhead.