Modern computer systems need to execute under strict safety constraints (e.g., a power limit), but doing so often conflicts with their ability to deliver high performance (i.e., minimal latency). Prior work uses machine learning to automatically tune hardware resources so that the system meets its safety constraints while maximizing performance. Such solutions monitor past executions to learn the system's behavior under different hardware resource allocations, then dynamically tune resources to optimize application execution. However, system behavior can change significantly between different applications and even between different inputs of the same application. Hence, models learned from data collected a priori are often suboptimal and violate safety constraints when used with new applications and inputs. To address this limitation, we introduce the concept of an execution space, which is the cross product of hardware resources, input features, and applications. To dynamically and safely allocate hardware resources from the execution space, we present SCOPE, a resource manager that leverages a novel safe exploration framework. We evaluate SCOPE's ability to deliver improved latency while minimizing power constraint violations by dynamically configuring hardware while running a variety of Apache Spark applications. Compared to prior approaches that minimize power constraint violations, SCOPE consumes comparable power while improving latency by up to 9.5×. Compared to prior approaches that minimize latency, SCOPE achieves similar latency but reduces power constraint violation rates by up to 45.88×, achieving almost zero safety constraint violations across all applications.
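As a rough illustration of the execution space described above, the following Python sketch enumerates the cross product of hardware resources, input features, and applications. The specific knobs, values, and the `execution_space` helper are hypothetical, chosen only for illustration; they are not taken from SCOPE itself.

```python
from itertools import product

# Hypothetical hardware knobs, input features, and applications.
# These values are assumptions for illustration, not SCOPE's actual settings.
cores = [2, 4, 8]
cpu_freq_ghz = [1.2, 2.4]
input_size_gb = [1, 10]
applications = ["kmeans", "pagerank"]


def execution_space(hardware, inputs, apps):
    """Return every (hardware config, input feature, application) triple."""
    return list(product(hardware, inputs, apps))


# A hardware configuration is itself a combination of resource settings.
hardware_configs = list(product(cores, cpu_freq_ghz))

space = execution_space(hardware_configs, input_size_gb, applications)
print(len(space))  # 6 hardware configs * 2 input sizes * 2 applications
```

Even with these few values the space has 24 points, which suggests why a model trained on one application or input may not transfer to another point in the space.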