Most IT systems depend on a set of configuration variables (CVs), each expressed as a name/value pair, that collectively define the resource allocation for the system. While the ill effects of misconfiguration or improper resource allocation are well known, there is no effective a priori metric to quantify the impact of the configuration on desired system attributes such as performance and availability. In this paper, we propose a \textit{Configuration Health Index} (CHI) framework, specifically attuned to the performance attribute, to capture the influence of CVs on the performance aspects of the system. We show how CHI, which is defined as a configuration scoring system, can take advantage of domain knowledge and the available (but rather limited) performance data to produce important insights into the configuration settings. We compare CHI with both well-advertised segmented non-linear models and state-of-the-art data-driven models, and show that CHI not only consistently provides better results but also avoids the dangers of a purely data-driven approach, which may predict incorrect behavior or eliminate some essential configuration variables from consideration.