The initial version of this document was written in 2014 with the sole purpose of presenting the fundamentals of reliability theory and identifying the theoretical machinery for predicting the durability and availability of erasure-coded storage systems. Since the definition of a "system" is too broad, we focus specifically on warm/cold storage systems, in which data is stored in a distributed fashion across different storage units that may or may not operate continuously. The document reviews the fundamentals, a few major improved stochastic models, and several contributions of my own work relevant to the field. One of these contributions is the introduction of the most general form of Markov model for estimating the mean time to failure; this work was later partially published in IEEE Transactions on Reliability. Very good closed-form approximations to the solution of this general model are also investigated, and various storage configurations under different policies are compared using these advanced models. In a subsequent chapter, we also consider multi-dimensional Markov models to address detached drive-medium combinations such as those found in optical disk and tape storage systems. It is not hard to anticipate that such a system structure would most likely be part of future DNA storage libraries. This work was partially published in Elsevier Reliability and System Safety. Simulation modeling for more accurate estimation is covered towards the end of the document, motivated by the deficiencies of both the simplified canonical and the more complex Markov models, which stem mainly from the stationary and static nature of the Markov assumption. Throughout the document, we focus on concurrently maintained systems, although the discussion changes only slightly for systems repaired one device at a time.