Heterogeneous computing systems, which combine general-purpose processors with specialized accelerators, are increasingly important for optimizing the performance of modern applications. A central challenge is to decide which parts of an application should be executed on which accelerator or, more generally, how to map the tasks of an application to available devices. Predicting the impact of a change in a task mapping on the overall makespan is non-trivial. While there are very capable simulators, these generally require a full implementation of the tasks in question, which is particularly time-intensive for programmable logic. A promising alternative is to use a purely analytical function, which allows for very fast predictions, but abstracts significantly from reality. Bridging the gap between theory and practice poses a significant challenge to algorithm developers. This paper aims to aid in the development of rapid makespan prediction algorithms by providing a highly flexible evaluation framework for heterogeneous systems consisting of CPUs, GPUs and FPGAs, which is capable of collecting real-world makespan results based on abstract task graph descriptions. We analyze to what extent actual makespans can be predicted by existing analytical approaches. Furthermore, we present common challenges that arise from high-level characteristics such as data transfer overhead and device congestion in heterogeneous systems.