Chiplets have become a common methodology in modern chip design. Chiplets improve yield and enable heterogeneity at the level of cores, the memory subsystem, and the interconnect. Convolutional Neural Networks (CNNs) have high computational, bandwidth, and memory capacity requirements owing to their increasingly large number of weights. Thus, to exploit chiplet-based architectures, CNNs must be optimized in terms of scheduling and workload distribution among computing resources. We propose Shisha, an online approach to generate and schedule parallel CNN pipelines on chiplet architectures. Shisha targets heterogeneity in compute performance and memory bandwidth and tunes the pipeline schedule through a fast online exploration technique. We compare Shisha with Simulated Annealing, Hill Climbing, and Pipe-Search. On average, Shisha's convergence time is ~35x faster than that of the other exploration algorithms. Despite the quick exploration, Shisha's solution is often better than those of the other heuristic exploration algorithms.