Distributed systems in general and cloud systems in particular, are susceptible to failures that can lead to substantial economic and data losses, security breaches, and even potential threats to human safety. Software ageing is an example of one such vulnerability. It emerges due to routine re-usage of computational systems units which induce fatigue within the components, resulting in an increased failure rate and potential system breakdown. Due to its stochastic nature, ageing cannot be directly measured, instead ageing indicators as proxies are used. While there are dozens of studies on different ageing indicators, their comprehensive comparison in different settings remains underexplored. In this paper, we compare two ageing indicators in OpenStack as a use case. Specifically, our evaluation compares memory usage (including swap memory) and request response time, as readily available indicators. By executing multiple OpenStack deployments with varying configurations, we conduct a series of experiments and analyze the ageing indicators. Comparative analysis through statistical tests provides valuable insights into the strengths and weaknesses of the utilised ageing indicators. Finally, through an in-depth analysis of other OpenStack failures, we identify underlying failure patterns and their impact on the studied ageing indicators.
翻译:暂无翻译