The Glivenko-Cantelli theorem states that the empirical distribution function of a real-valued random variable $X$ converges uniformly, almost surely, to its theoretical distribution function $F$. This result is important because it establishes that sampling captures the dispersion of probability that the distribution function $F$ imposes. In essence, sampling allows one to learn and infer the behavior of $F$ by looking only at observations of $X$. The probabilities inferred from a sample $\mathbf{X}$ become more precise as the sample size increases and more data become available. It is therefore valid to study distributions through samples. The proof presented here is constructive, meaning that the result is derived directly from the fact that the empirical distribution function converges pointwise almost surely to the theoretical distribution function. The work includes a proof of this preliminary statement and attempts to motivate the intuition one gains from sampling techniques when studying the regions in which a model concentrates probability. The sets on which the empirical distribution function describes dispersion precisely eventually cover the entire sample space.
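The uniform convergence described above can be observed numerically. The following is a minimal sketch, not the construction used in the proof: it assumes a standard normal $F$ purely for illustration, and the names `std_normal_cdf`, `sup_distance`, and `demo` are hypothetical helpers introduced here. It computes $\sup_x |F_n(x) - F(x)|$ for growing sample sizes $n$, where $F_n$ is the empirical distribution function.

```python
import math
import random


def std_normal_cdf(x):
    """CDF of the standard normal distribution (illustrative choice of F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sup_distance(sample, cdf):
    """Return sup_x |F_n(x) - F(x)| for a continuous CDF F.

    F_n is a right-continuous step function, so the supremum is reached
    either at a sample point or in the limit just before one.
    """
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # (i + 1) / n is F_n at x; i / n is the value of F_n just before x.
        worst = max(worst, (i + 1) / n - f, f - i / n)
    return worst


def demo(sizes=(10, 100, 1000, 10000), seed=0):
    """Map each sample size to the sup distance of one simulated sample."""
    rng = random.Random(seed)
    return {
        n: sup_distance([rng.gauss(0.0, 1.0) for _ in range(n)], std_normal_cdf)
        for n in sizes
    }


if __name__ == "__main__":
    for n, d in demo().items():
        print(f"n={n:>6}  sup|F_n - F| = {d:.4f}")
```

In a typical run the printed distance shrinks as $n$ grows, which is the behavior the theorem guarantees almost surely; a single simulated sequence only illustrates the statement and does not prove it.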