Random uniform sampling has been studied for many statistical tasks, but few studies have examined it under the Q-error metric used in cardinality estimation (CE). In this paper, we analyze the confidence intervals of random uniform sampling, with and without replacement, for single-table CE. Our results indicate that the upper bound on the Q-error depends on the sample size and the true cardinality. This bound yields a rule of thumb for how large a sample to keep for single-table CE.
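As a minimal illustration of the setting (a sketch, not the paper's analysis), the snippet below estimates a single-table cardinality from a uniform random sample and measures the Q-error of that estimate. It assumes the usual definition of Q-error, max(estimate/truth, truth/estimate). The function names, the toy table, the predicate, and the convention of clamping both values to at least 1 (so that an empty sample match does not divide by zero) are all assumptions made for this example.

```python
import random


def q_error(estimate, truth):
    """Q-error = max(estimate/truth, truth/estimate).

    Assumption: both values are clamped to >= 1 so that a zero
    estimate or zero truth does not cause a division by zero.
    """
    est = max(float(estimate), 1.0)
    tru = max(float(truth), 1.0)
    return max(est / tru, tru / est)


def estimate_cardinality(table, predicate, sample_size, rng, replace=False):
    """Scale the number of matching sampled rows up to the full table size."""
    if replace:
        sample = rng.choices(table, k=sample_size)  # with replacement
    else:
        sample = rng.sample(table, sample_size)     # without replacement
    hits = sum(1 for row in sample if predicate(row))
    return hits * len(table) / sample_size


def max_q_error_over_trials(table, predicate, sample_size, trials, seed=0,
                            replace=False):
    """Empirical worst-case Q-error over repeated independent samples."""
    rng = random.Random(seed)
    truth = sum(1 for row in table if predicate(row))
    return max(
        q_error(
            estimate_cardinality(table, predicate, sample_size, rng, replace),
            truth,
        )
        for _ in range(trials)
    )


if __name__ == "__main__":
    # Hypothetical toy table: 100,000 integer rows; the predicate selects 1%.
    table = list(range(100_000))

    def predicate(row):
        return row < 1_000

    for n in (100, 1_000, 10_000):
        worst = max_q_error_over_trials(table, predicate, n, trials=200)
        print(f"sample_size={n:>6}  max Q-error over 200 trials: {worst:.2f}")
```

Running the script with different sample sizes and predicate selectivities gives an empirical feel for the dependence on sample size and true cardinality that the abstract describes. It does not reproduce the paper's bound.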