To satisfy their timing constraints, modern real-time applications require massively parallel accelerators such as General-Purpose Graphics Processing Units (GPGPUs). Generation after generation, the number of computing clusters available in new GPU architectures steadily increases; investigating suitable scheduling approaches is therefore mandatory. Such approaches map different, concurrent compute kernels onto the GPU computing clusters by grouping the clusters into schedulable partitions. In this paper we propose novel techniques to define GPU partitions; these allow us to devise task-to-partition allocation mechanisms in which tasks are GPU compute kernels with different timing requirements. The allocation mechanisms account for the interference that GPU kernels experience when running in overlapping time windows, and we also present an effective and simple way to quantify the magnitude of this interference. We demonstrate the efficiency of the proposed approaches against classical techniques that treat the GPU as a single, non-partitionable resource.
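To make the allocation idea concrete, the following is a minimal sketch of an interference-aware, first-fit assignment of kernels to cluster partitions. The utilization model, the multiplicative interference factor, the partition sizes, and all kernel parameters are illustrative assumptions and not the mechanisms proposed in the paper.

```python
# Hypothetical sketch: interference-aware first-fit allocation of GPU kernels
# (tasks) to cluster partitions. The utilization model and the interference
# factor below are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Kernel:
    name: str
    wcet: float      # worst-case execution time when running alone (ms)
    period: float    # activation period / deadline (ms)

    @property
    def utilization(self) -> float:
        return self.wcet / self.period

@dataclass
class Partition:
    clusters: int                           # GPU computing clusters in this partition
    kernels: List[Kernel] = field(default_factory=list)

    def load(self, interference: float) -> float:
        # Inflate each kernel's utilization by a factor that grows with the
        # number of co-located kernels (assumed interference model).
        n = len(self.kernels)
        return sum(k.utilization * (1.0 + interference * max(n - 1, 0))
                   for k in self.kernels)

def first_fit(kernels: List[Kernel], partitions: List[Partition],
              interference: float = 0.1, bound: float = 1.0) -> Optional[List[Partition]]:
    """Assign kernels (largest utilization first) to the first partition whose
    inflated load stays within the schedulability bound; None if infeasible."""
    for k in sorted(kernels, key=lambda k: k.utilization, reverse=True):
        for p in partitions:
            p.kernels.append(k)
            if p.load(interference) <= bound:
                break
            p.kernels.pop()
        else:
            return None  # no feasible partition found for this kernel
    return partitions

if __name__ == "__main__":
    parts = [Partition(clusters=4), Partition(clusters=4)]
    tasks = [Kernel("detect", 4.0, 10.0), Kernel("track", 2.0, 10.0),
             Kernel("plan", 5.0, 20.0)]
    result = first_fit(tasks, parts)
    if result:
        for i, p in enumerate(result):
            print(f"partition {i}: {[k.name for k in p.kernels]}")
    else:
        print("infeasible under the assumed interference model")
```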