As Deep Learning continues to drive a variety of applications in edge and cloud data centers, there is a growing trend towards building large accelerators with several sub-accelerator cores/chiplets. This work looks at the problem of supporting multi-tenancy on such accelerators. In particular, we focus on mapping jobs from several DNNs simultaneously onto an accelerator. Given the extremely large search space, we formulate the search as an optimization problem and develop an optimization framework called M3E. In addition, we develop a specialized optimization algorithm called MAGMA with custom operators to enable structured, sample-efficient exploration. We quantitatively compare MAGMA against several state-of-the-art black-box optimization and reinforcement learning methods across different accelerator settings (large/small accelerators) and different sub-accelerator configurations (homogeneous/heterogeneous), and observe that MAGMA consistently finds better mappings.
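To make the search problem concrete, the sketch below shows a minimal evolutionary search over mappings, where each candidate assigns DNN layers to sub-accelerators and custom crossover/mutation operators evolve the population. The cost model (makespan of the most loaded sub-accelerator), operator designs, and all parameters here are illustrative assumptions for exposition, not MAGMA's or M3E's actual formulation.

```python
import random

# Illustrative setup (assumed, not from the paper): assign each of N layers,
# drawn from several concurrent DNN jobs, to one of K sub-accelerators.
N_LAYERS, N_SUBACCELS = 12, 4
random.seed(0)
work = [random.randint(1, 10) for _ in range(N_LAYERS)]  # per-layer compute cost

def fitness(mapping):
    # Toy cost model: latency is bounded by the busiest sub-accelerator
    # (makespan), so we maximize its negation.
    load = [0] * N_SUBACCELS
    for layer, accel in enumerate(mapping):
        load[accel] += work[layer]
    return -max(load)

def crossover(a, b):
    # Single-point crossover: splice two parent mappings at a random cut.
    cut = random.randrange(1, N_LAYERS)
    return a[:cut] + b[cut:]

def mutate(mapping, rate=0.1):
    # Reassign each layer to a random sub-accelerator with small probability.
    return [random.randrange(N_SUBACCELS) if random.random() < rate else g
            for g in mapping]

def search(pop_size=20, generations=50):
    pop = [[random.randrange(N_SUBACCELS) for _ in range(N_LAYERS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half
        pop = elite + [mutate(crossover(*random.sample(elite, 2)))
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=fitness)

best = search()
print("best mapping:", best, "makespan:", -fitness(best))
```

Even this naive loop illustrates why sample efficiency matters: with K^N possible mappings (4^12 here, and vastly more for real multi-DNN workloads), structured operators that preserve good partial assignments are essential for exploring the space with a limited evaluation budget.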