Cloud computing is the backbone of the digital society: digital banking, media, communication, gaming, and many other services depend on it. Unfortunately, cloud services can fail, degrading the services built on them, frustrating users, and potentially costing companies millions of dollars. Understanding a cloud service failure requires a detailed account of why and how the service failed. Previous work studies how cloud services fail using logs published by cloud operators. However, little is known about how users perceive and experience cloud failures. We therefore collect and characterize user-reported cloud failure data from Down Detector for three cloud service providers over three years. We count the user reports and analyze their temporal patterns, then derive failures from those reports and characterize their duration and interarrival time. We also characterize provider-reported cloud failures and compare the results with those for user-reported failures. The comparison reveals how users perceive failures and what share of failures cloud service providers report. Overall, this work characterizes and compares user- and provider-reported cloud failures.
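To make the notions of failure duration and interarrival time concrete, the following is a minimal sketch of how failures might be derived from binned user-report counts. It is an illustration under stated assumptions, not the method used in this work: the fixed-threshold rule, the evenly spaced gap-free bins, and the names `derive_failures`, `durations`, and `interarrival_times` are all hypothetical.

```python
from datetime import datetime, timedelta


def derive_failures(counts, threshold, bin_width):
    """Group consecutive bins whose report count reaches the threshold.

    ASSUMPTION: a fixed threshold marks a failure; the source does not
    specify the actual derivation rule.
    counts: list of (bin_start, n_reports), sorted, evenly spaced, no gaps.
    Returns a list of (start, end) datetime pairs, one per failure.
    """
    failures = []
    start = None
    for bin_start, n_reports in counts:
        if n_reports >= threshold:
            if start is None:
                start = bin_start  # first bin of a new failure
        elif start is not None:
            failures.append((start, bin_start))  # failure ended before this bin
            start = None
    if start is not None:
        # Failure still ongoing in the last bin: close it at the end of that bin.
        failures.append((start, counts[-1][0] + bin_width))
    return failures


def durations(failures):
    """Length of each failure."""
    return [end - start for start, end in failures]


def interarrival_times(failures):
    """Time between the starts of consecutive failures."""
    starts = [start for start, _ in failures]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


if __name__ == "__main__":
    # Made-up illustrative counts in 15-minute bins, not real data.
    width = timedelta(minutes=15)
    t0 = datetime(2020, 1, 1)
    sample = [(t0 + i * width, n) for i, n in enumerate([0, 5, 7, 0, 0, 9, 0])]
    found = derive_failures(sample, threshold=5, bin_width=width)
    print(durations(found))
    print(interarrival_times(found))
```

Interarrival time is measured here between failure start times; measuring from the end of one failure to the start of the next is an equally plausible convention and would only change `interarrival_times`.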