Binary Neural Networks (BNNs) are increasingly preferred over full-precision Convolutional Neural Networks (CNNs) to reduce the memory and computational requirements of inference processing with minimal accuracy drop. BNNs convert CNN model parameters to 1-bit precision, allowing BNN inference to be performed with simple XNOR and bitcount operations. This makes BNNs amenable to hardware acceleration. Several photonic integrated circuit (PIC) based BNN accelerators have been proposed. Although these accelerators provide substantially higher throughput and energy efficiency than their electronic counterparts, the XNOR and bitcount circuits they employ need further enhancement to improve their area, energy efficiency, and throughput. This paper aims to fulfill this need. For that, we invent a single microring resonator (MRR) based optical XNOR gate (OXG). Moreover, we present a novel bitcount circuit, which we refer to as the Photo-Charge Accumulator (PCA). We employ multiple OXGs in a cascaded manner using dense wavelength division multiplexing (DWDM) and connect them to the PCA, to forge a novel Optical XNOR-Bitcount based Binary Neural Network Accelerator (OXBNN). Our evaluation of the inference of four modern BNNs indicates that OXBNN provides improvements of up to 62x and 7.6x in frames-per-second (FPS) and FPS/W (energy efficiency), respectively, on geometric mean over two PIC-based BNN accelerators from prior work. We developed a transaction-level, event-driven, Python-based simulator for the evaluation of accelerators (https://github.com/uky-UCAT/B_ONN_SIM).
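To make the XNOR-bitcount formulation concrete, the following minimal Python sketch (illustrative only; not part of OXBNN or the linked simulator) shows how a dot product over binarized {-1, +1} vectors reduces to an XNOR followed by a bitcount. The function name binary_dot and the LSB-first bit-packing convention are assumptions made for illustration.

```python
# Illustrative sketch (not the paper's accelerator): a binarized dot product
# computed with XNOR + bitcount. Elements in {-1, +1} are packed as bits
# (1 -> +1, 0 -> -1), LSB-first.

def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed into integers.

    XNOR marks the positions where the operands agree; the bitcount of that
    result gives the number of matches, and the dot product equals
    matches - mismatches = 2 * matches - n.
    """
    mask = (1 << n) - 1                # keep only the n valid bit lanes
    xnor = ~(a_bits ^ w_bits) & mask   # 1 where the two vectors agree
    matches = bin(xnor).count("1")     # bitcount (popcount)
    return 2 * matches - n


# Example: a = [+1, -1, +1, +1] packs to 0b1101, w = [+1, +1, -1, +1] to 0b1011.
# Two matches and two mismatches give a dot product of 0.
assert binary_dot(0b1101, 0b1011, 4) == 0
```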