Non-volatile memory (NVM) crossbars have been identified as a promising technology for accelerating important machine learning operations, with matrix-vector multiplication being a key example. Binary neural networks (BNNs) are especially well suited to NVM crossbars because they use low-bitwidth representations for both activations and weights. However, the aggressive quantization of BNNs can result in suboptimal accuracy, and the analog effects of NVM crossbars can further degrade inference accuracy. This paper presents a comprehensive study that benchmarks BNNs trained and validated on ImageNet and deployed on NeuroSim, a simulator for NVM-crossbar-based processing-in-memory (PIM) architectures. Our study analyzes the impact of parameters such as input precision and ADC resolution on both inference accuracy and hardware performance metrics. We find that an 8-bit ADC resolution combined with a 4-bit input precision achieves near-optimal accuracy relative to the original BNNs. In addition, we identify the bottleneck components of the PIM architecture that dominate area, latency, and energy consumption, and we demonstrate the impact that different BNN layers have on hardware performance.
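To make the roles of input precision and ADC resolution concrete, the following is a minimal numerical sketch (not NeuroSim, and not the paper's actual methodology) of a crossbar-style matrix-vector multiplication: weights are binary, activations are quantized to a chosen input precision and fed one bit-plane at a time, and each analog column readout passes through a uniform ADC of a chosen resolution before digital shift-and-add accumulation. All sizes, ranges, and the ADC model are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: bit-serial crossbar MVM with binary weights,
# quantized inputs, and a uniform ADC. Parameters mirror the setting
# discussed in the text (4-bit inputs, 8-bit ADC) but the model is a
# simplification, not the simulator used in the study.

rng = np.random.default_rng(0)

IN_BITS = 4    # input (activation) precision
ADC_BITS = 8   # ADC resolution per column read

rows, cols = 128, 64
W = rng.choice([-1, 1], size=(rows, cols))   # binary {-1, +1} weights
x = rng.random(rows)                         # activations in [0, 1)

# Quantize activations to IN_BITS-bit unsigned integers.
x_q = np.floor(x * (2**IN_BITS - 1)).astype(int)

def adc(col_sums, bits):
    """Uniform ADC: quantize analog column sums over the worst-case
    partial-sum range [-rows, rows] into 2**bits levels."""
    lo, hi = -rows, rows
    levels = 2**bits - 1
    code = np.round((col_sums - lo) / (hi - lo) * levels)
    return code / levels * (hi - lo) + lo

# Bit-serial MVM: one input bit-plane per read, ADC each column sum,
# then shift-and-add the digital partial results.
acc = np.zeros(cols)
for b in range(IN_BITS):
    bit_plane = (x_q >> b) & 1              # one bit of every activation
    acc += adc(bit_plane @ W, ADC_BITS) * (1 << b)

ideal = x_q @ W                             # full-precision digital MVM
err = np.abs(acc - ideal).max()
print(f"max abs error vs ideal integer MVM: {err:.2f}")
```

Lowering `ADC_BITS` in this toy model makes each column readout coarser and the accumulated error grow, which is the kind of accuracy/hardware trade-off the study quantifies on real BNN workloads.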