Capsule networks (CapsNets) are an emerging trend in image processing. In contrast to convolutional neural networks, CapsNets are robust to object deformation, since the relative spatial information of objects is preserved across the network. However, their complexity stems mainly from the capsule structure and the dynamic routing mechanism, which makes it almost impractical to deploy a CapsNet, in its original form, on a resource-constrained device powered by a small microcontroller (MCU). In an era where intelligence is rapidly shifting from the cloud to the edge, this high complexity poses serious challenges to the adoption of CapsNets at the very edge. To tackle this issue, we present an API for the execution of quantized CapsNets on Arm Cortex-M and RISC-V MCUs. Our software kernels extend the Arm CMSIS-NN and RISC-V PULP-NN libraries to support capsule operations with 8-bit integer operands. Alongside the API, we propose a framework to perform post-training quantization of a CapsNet. Results show a reduction in memory footprint of almost 75%, with an accuracy loss ranging from 0.07% to 0.18%. In terms of throughput, our Arm Cortex-M API enables the execution of primary capsule and capsule layers with medium-sized kernels in just 119.94 and 90.60 milliseconds (ms), respectively (STM32H755ZIT6U, Cortex-M7 @ 480 MHz). For the GAP-8 SoC (RISC-V RV32IMCXpulp @ 170 MHz), the latency drops to 7.02 and 38.03 ms, respectively.
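The ~75% memory reduction is consistent with storing parameters as 8-bit integers instead of 32-bit floats. As a rough illustration of the idea, the following is a minimal sketch of symmetric per-tensor post-training quantization to int8 in C; it is an assumption for clarity only, as the abstract does not specify the paper's exact quantization scheme (per-layer scales, rounding mode, or activation handling), and the function name `quantize_s8` is hypothetical.

```c
/* Hypothetical sketch: symmetric per-tensor float32 -> int8 quantization.
 * Not the paper's actual framework; shown only to illustrate why weight
 * storage shrinks by roughly 75%. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

static float quantize_s8(const float *w, int8_t *q, int n)
{
    /* Find the largest magnitude to derive a single scale factor. */
    float max_abs = 0.0f;
    for (int i = 0; i < n; i++) {
        float a = fabsf(w[i]);
        if (a > max_abs) max_abs = a;
    }
    float scale = (max_abs > 0.0f) ? (max_abs / 127.0f) : 1.0f;

    /* Round each weight to the nearest int8 value and clamp. */
    for (int i = 0; i < n; i++) {
        int v = (int)lroundf(w[i] / scale);
        if (v > 127)  v = 127;
        if (v < -128) v = -128;
        q[i] = (int8_t)v;
    }
    return scale; /* scale is kept to dequantize accumulator results */
}

int main(void)
{
    float w[4] = {0.42f, -1.30f, 0.05f, 0.77f};
    int8_t q[4];
    float scale = quantize_s8(w, q, 4);
    printf("scale=%f q={%d,%d,%d,%d}\n", scale, q[0], q[1], q[2], q[3]);
    return 0;
}
```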