PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

Accurately predicting deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is critical for the (manual or automatic) design, optimization, and deployment of practical DNNs on that platform. Unfortunately, these metrics are slow to evaluate using simulators (where available) and typically require measurement on the target hardware. This work describes PerfSAGE, a novel graph neural network (GNN) that predicts inference latency, energy, and memory footprint for an arbitrary DNN TFLite graph (TFL, 2017). In contrast, previously published performance predictors can only predict latency and are restricted to pre-defined construction rules or search spaces. This paper also describes the EdgeDLPerf dataset of 134,912 DNNs randomly sampled from four task search spaces and annotated with inference performance metrics from three edge hardware platforms. Using this dataset, we train PerfSAGE and provide experimental results that demonstrate state-of-the-art prediction accuracy, with a Mean Absolute Percentage Error of <5% across all targets and model search spaces. These results (1) outperform previous state-of-the-art GNN-based predictors (Dudziak et al., 2020), (2) accurately predict performance on accelerators (a shortfall of non-GNN-based predictors (Zhang et al., 2021)), and (3) demonstrate predictions on arbitrary input graphs without modifications to the feature extractor.
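
The accuracy figure above is reported as a Mean Absolute Percentage Error (MAPE). As a minimal sketch of how that metric is commonly computed, assuming the standard definition (the mean of |predicted - measured| / |measured|, expressed in percent), the following may help in reading the <5% claim. The function name `mape` and the sample latency values are hypothetical illustrations and are not taken from the paper or the EdgeDLPerf dataset.

```python
def mape(measured, predicted):
    """Mean Absolute Percentage Error, in percent (standard definition, assumed here)."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if not measured:
        raise ValueError("at least one sample is required")
    total = 0.0
    for m, p in zip(measured, predicted):
        if m == 0:
            # The relative error is undefined for a zero reference value.
            raise ValueError("measured values must be non-zero")
        total += abs(p - m) / abs(m)
    return 100.0 * total / len(measured)


# Hypothetical latencies in milliseconds; illustrative values only, not real results.
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [10.5, 19.0, 42.0]
print(f"MAPE: {mape(measured_ms, predicted_ms):.2f}%")
```

Each hypothetical prediction here is off by 5% of its measured value, so the resulting MAPE is about 5%, which sits right at the threshold quoted in the abstract.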
