When utilized effectively, Supercloud heterogeneous systems have the potential to significantly enhance performance. Our ReDSEa tool-chain automates the mapping, load balancing, scheduling, parallelism, and overlapping processes for the Triangular System Solver (TS) on a heterogeneous system consisting of a Huawei Kunpeng ARM multi-core CPU and an Ascend 910 AI HW accelerator. We propose an LLVM compiler tool-chain that a) leverages compiler analysis and b) utilizes novel performance models exploring recursive, iterative, and blocked computation models. Our tool-chain facilitates a speedup of up to 16x compared to an optimized 48-core CPU-only implementation.
翻译:暂无翻译