High-performance deep neural network (DNN)-based systems are in high demand in edge environments. Because of their high computational complexity, DNNs are challenging to deploy on edge devices with strict limits on computational resources. In this paper, we derive a compact yet highly accurate DNN model, termed dsODENet, by combining two recently proposed parameter reduction techniques: Neural ODE (Ordinary Differential Equation) and DSC (Depthwise Separable Convolution). Neural ODE exploits a similarity between ResNet and ODE and shares most of the weight parameters among multiple layers, which greatly reduces memory consumption. We apply dsODENet to domain adaptation as a practical use case with image classification datasets. We also propose a resource-efficient FPGA-based design for dsODENet, in which all parameters and feature maps except those of the pre- and post-processing layers can be mapped onto on-chip memories. The design is implemented on a Xilinx ZCU104 board and evaluated in terms of domain adaptation accuracy, inference speed, FPGA resource utilization, and speedup rate compared to a software counterpart. The results demonstrate that dsODENet achieves comparable or slightly better domain adaptation accuracy than our baseline Neural ODE implementation, while the total parameter size, excluding the pre- and post-processing layers, is reduced by 54.2% to 79.8%. Our FPGA implementation speeds up inference by 23.8 times.
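To make the two parameter reduction ideas concrete, the following is a minimal sketch, not the dsODENet implementation. It counts convolution weights for a standard convolution and for a depthwise separable one, and contrasts a ResNet-style stage (separate weights per block) with a Neural-ODE-style stage (one block reused at every step of a forward Euler solver). The function names, the channel count, the kernel size, the number of blocks, and the choice of the Euler method are all assumptions made for illustration; they are not taken from the paper, and the resulting numbers are unrelated to the 54.2% to 79.8% figures reported above.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dsc_params(c_in, c_out, k):
    """Weights in a depthwise separable convolution (bias omitted):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out


def resnet_stage_params(n_blocks, channels, k, per_conv):
    """ResNet-style stage: every block owns its own convolution weights."""
    return n_blocks * per_conv(channels, channels, k)


def ode_stage_params(channels, k, per_conv):
    """Neural-ODE-style stage: a single block is reused at every solver step,
    so the weight count does not grow with the number of steps."""
    return per_conv(channels, channels, k)


def euler_integrate(f, h, steps, t0=0.0, t1=1.0):
    """Forward Euler: h <- h + dt * f(h, t), applying the same f at each step.
    This mirrors a residual update h + f(h) when dt = 1."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        h = h + dt * f(h, t)
        t += dt
    return h


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    channels, k, n_blocks = 64, 3, 10
    print("ResNet stage, standard conv:", resnet_stage_params(n_blocks, channels, k, conv_params))
    print("ODE stage, standard conv:   ", ode_stage_params(channels, k, conv_params))
    print("ODE stage, DSC:             ", ode_stage_params(channels, k, dsc_params))
    # Toy scalar dynamics dh/dt = h, integrated with 10 shared-weight steps.
    print("Euler result:", euler_integrate(lambda h, t: h, 1.0, steps=10))
```

Under these assumed sizes, weight sharing removes the dependence on the number of blocks, and replacing the shared convolution with a depthwise separable one shrinks the remaining weights further; the actual savings depend on the real layer configuration.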