GPTPU: Accelerating Applications using Edge Tensor Processing Units

From arXiv. This paper is a preprint of a paper in SC 2021, the International Conference for High Performance Computing, Networking, Storage and Analysis.

Neural network (NN) accelerators have been integrated into a wide spectrum of computer systems to accommodate the rapidly growing demands for artificial intelligence (AI) and machine learning (ML) applications. NN accelerators share the idea of providing native hardware support for operations on multidimensional tensor data. Therefore, NN accelerators are theoretically tensor processors that can improve system performance for any problem that uses tensors as inputs/outputs. Unfortunately, commercially available NN accelerators only expose computation capabilities through AI/ML-specific interfaces. Furthermore, NN accelerators reveal very few hardware design details, so applications cannot easily leverage the tensor operations NN accelerators provide. This paper introduces General-Purpose Computing on Edge Tensor Processing Units (GPTPU), an open-source, open-architecture framework that allows the developer and research communities to discover opportunities that NN accelerators enable for applications. GPTPU includes a powerful programming interface with efficient runtime system-level support -- similar to that of CUDA/OpenCL in GPGPU computing -- to bridge the gap between application demands and mismatched hardware/software interfaces. We built a GPTPU machine that uses Edge Tensor Processing Units (Edge TPUs), which are widely available and representative of many commercial NN accelerators. We identified several novel use cases and revisited the algorithms. By leveraging the underlying Edge TPUs to perform tensor-algorithm-based compute kernels, our results reveal that GPTPU can achieve a 2.46x speedup over high-end CPUs and reduce energy consumption by 40%.
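
To make the claim about "any problem that uses tensors as inputs/outputs" concrete, here is a minimal sketch of a non-ML problem recast as a tensor operation. It is not GPTPU's programming interface and is not taken from the paper: the graph example and the names `matmul` and `count_walks` are assumptions chosen for illustration, and the code runs on the CPU in plain Python. It counts walks in a directed graph by repeatedly multiplying the adjacency matrix, which is the kind of matrix kernel a tensor processor could in principle take over.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def count_walks(adjacency, length):
    """Count walks of exactly `length` edges between every pair of nodes.

    Entry [i][j] of the adjacency matrix raised to the power `length`
    is the number of such walks from node i to node j.
    """
    n = len(adjacency)
    # Start from the identity matrix (walks of length 0).
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        result = matmul(result, adjacency)
    return result


# Hypothetical directed graph with edges 0->1, 0->2, 1->2, 2->0.
graph = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]

walks = count_walks(graph, 2)
print(walks)
```

The graph problem never mentions neural networks, yet its whole cost sits in `matmul`. That is the gap the abstract describes: the operation is one an NN accelerator supports natively, but an AI/ML-specific interface gives an application like this no direct way to request it.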
