SDRM3: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads

Emerging real-time multi-model ML (RTMM) workloads such as AR/VR and drone control often exhibit dynamic behavior at several levels: the task, the model, and the layers (or ML operators) within a model. Such dynamic behavior poses new challenges to the system software of an ML system because, unlike in traditional ML workloads, the overall system load is unpredictable. In addition, real-time processing requires meeting deadlines, and multi-model workloads involve highly heterogeneous models. Because RTMM workloads often run on resource-constrained devices (e.g., a VR headset), developing an effective scheduler is an important research problem. We therefore propose a new scheduler, SDRM3, that effectively handles the various forms of dynamicity in RTMM-style workloads on multi-accelerator systems. SDRM3 quantifies the unique requirements of RTMM workloads and uses the resulting scores to drive scheduling decisions, taking into account the current system load and other inference jobs on different models and input frames. SDRM3 has tunable parameters that adapt quickly to dynamic workload changes through a gradient descent-like online optimization, which typically converges within five steps for new workloads. We also propose a Supernet-based method that exploits model-level dynamicity to trade off scheduling effectiveness against model performance (e.g., accuracy) by dynamically selecting a suitable sub-network of a Supernet based on the system load. In our evaluation on five realistic RTMM workload scenarios, SDRM3 reduces the overall UXCost, an energy-delay-product (EDP)-equivalent metric for real-time applications defined in the paper, by 37.7% and 53.2% on geometric mean (up to 97.6% and 97.1%) compared to state-of-the-art baselines, which demonstrates the efficacy of our scheduling methodology.
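The idea of selecting a sub-network according to system load can be illustrated with a minimal sketch. This is not SDRM3's actual policy, which the abstract does not specify; the `SubNet` type, the `select_subnet` function, the slack-based rule, and all latency and accuracy numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SubNet:
    """One hypothetical sub-network of a Supernet (numbers are illustrative)."""
    name: str
    latency_ms: float  # assumed latency estimate on the target accelerator
    accuracy: float    # assumed offline-measured accuracy


def select_subnet(candidates: Sequence[SubNet],
                  deadline_ms: float,
                  queued_work_ms: float) -> Optional[SubNet]:
    """Pick the most accurate sub-network that still fits before the deadline.

    queued_work_ms is a stand-in for system load: work already scheduled
    ahead of this job on the same accelerator.
    """
    slack_ms = deadline_ms - queued_work_ms
    feasible = [s for s in candidates if s.latency_ms <= slack_ms]
    if not feasible:
        # Nothing fits: fall back to the fastest variant to limit the miss.
        return min(candidates, key=lambda s: s.latency_ms, default=None)
    return max(feasible, key=lambda s: s.accuracy)


CANDIDATES = [
    SubNet("large", latency_ms=12.0, accuracy=0.80),
    SubNet("medium", latency_ms=7.0, accuracy=0.77),
    SubNet("small", latency_ms=3.0, accuracy=0.72),
]

# Light load leaves 14 ms of slack, heavy load 6 ms, overload only 1 ms.
light = select_subnet(CANDIDATES, deadline_ms=16.0, queued_work_ms=2.0)
heavy = select_subnet(CANDIDATES, deadline_ms=16.0, queued_work_ms=10.0)
overloaded = select_subnet(CANDIDATES, deadline_ms=16.0, queued_work_ms=15.0)
print(light.name, heavy.name, overloaded.name)
```

Under these assumed numbers, a lightly loaded system keeps the most accurate variant, while heavier load pushes the choice toward a smaller and faster one. A real scheduler would presumably also weigh energy and the other queued jobs, which this sketch omits.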

