We present a novel task scheduling scheme for accelerating computational applications that involve distributed iterative processes executed on networked computing resources. Such an application consists of multiple tasks, each of which outputs data at every iteration to be processed by neighboring tasks; these dependencies between tasks can be represented as a directed graph. We first formulate the problem mathematically as a Binary Quadratic Program (BQP) that accounts for both computation and communication costs, and we show that the problem is NP-hard. We then relax the problem to a Semidefinite Program (SDP) and apply a randomized rounding technique based on sampling from a suitably formulated multivariate Gaussian distribution. Furthermore, we derive the expected value of the bottleneck time. Finally, we apply the proposed scheme to gossip-based federated learning as an example of an iterative process. Through numerical evaluations on the MNIST and CIFAR-10 datasets, we show that the proposed approach outperforms well-known scheduling techniques from distributed computing. In particular, for arbitrary settings, it reduces bottleneck time by $91\%$ compared to HEFT and $84\%$ compared to throughput HEFT.
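The rounding step described above can be illustrated with a rough sketch. This is not the authors' implementation: the cost model in `bottleneck_time`, the function name `gaussian_rounding`, and the stand-in mean and covariance (which in practice would come from the solution of the SDP relaxation) are all assumptions made for illustration. The sketch draws Gaussian samples, rounds each one to a task-to-machine assignment by taking the largest entry per task, and keeps the assignment with the smallest bottleneck time.

```python
import numpy as np

def bottleneck_time(assign, comp, comm, adj):
    """Assumed cost model: the load of a machine is the compute cost of its
    tasks plus the cost of sending their outputs to neighbours placed on
    other machines; the bottleneck is the largest such load."""
    n_machines = comp.shape[1]
    load = np.zeros(n_machines)
    for task, machine in enumerate(assign):
        load[machine] += comp[task, machine]
        for nbr in np.flatnonzero(adj[task]):
            if assign[nbr] != machine:
                load[machine] += comm[task, nbr]
    return load.max()

def gaussian_rounding(mean, cov, comp, comm, adj, n_samples=200, seed=0):
    """Sample from N(mean, cov), round each sample to an assignment,
    and return the best assignment found with its bottleneck time."""
    rng = np.random.default_rng(seed)
    n_tasks, n_machines = comp.shape
    best_assign, best_cost = None, np.inf
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    for z in samples:
        # One block of n_machines entries per task; pick the largest entry.
        assign = z.reshape(n_tasks, n_machines).argmax(axis=1)
        cost = bottleneck_time(assign, comp, comm, adj)
        if cost < best_cost:
            best_assign, best_cost = assign, cost
    return best_assign, best_cost

# Toy instance (hypothetical numbers): 2 tasks, 2 machines, task 0 feeds task 1.
comp = np.array([[1.0, 2.0], [3.0, 1.0]])   # comp[task, machine]
adj = np.array([[0, 1], [0, 0]])            # adj[i, j] = 1 if task i sends to task j
comm = np.array([[0.0, 5.0], [0.0, 0.0]])   # comm[i, j] = cost of sending i -> j

# Stand-ins for the SDP output: an uninformative mean and a small isotropic covariance.
mean = np.full(comp.size, 0.5)
cov = 0.05 * np.eye(comp.size)

best_assign, best_cost = gaussian_rounding(mean, cov, comp, comm, adj)
print(best_assign, best_cost)
```

With an actual SDP solution, the mean and covariance would carry information about which assignments are promising, so far fewer samples would be wasted than with the uninformative stand-ins used here.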