Min-entropy is a widely used metric for quantifying the randomness of generated random numbers; it measures the difficulty of guessing the most likely output. Accurately estimating the min-entropy of a non-independent and identically distributed (non-IID) source is difficult. Hence, NIST Special Publication (SP) 800-90B adopts ten different min-entropy estimators and conservatively selects the minimum of the ten estimates. Among these estimators, the longest repeated substring (LRS) estimator estimates the collision entropy, rather than the min-entropy, by counting the number of repeated substrings. Since the collision entropy is an upper bound on the min-entropy, the LRS estimator inherently provides \emph{overestimated} outputs. In this paper, we propose two techniques to estimate the min-entropy of a non-IID source accurately. The first technique resolves the overestimation problem by translating the collision entropy into the min-entropy. The second generalizes the LRS estimator by adopting the general R{\'{e}}nyi entropy instead of the collision entropy (i.e., R{\'{e}}nyi entropy of order two). We show that adopting a higher order can reduce the variance of min-entropy estimates. By integrating these techniques, we propose a generalized LRS estimator that effectively resolves the overestimation problem and provides stable min-entropy estimates. Theoretical analysis and empirical results show that the proposed generalized LRS estimator significantly improves estimation accuracy, making it an appealing alternative to the LRS estimator.
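The gap between collision entropy and min-entropy described above can be illustrated numerically. The following is a minimal sketch, not the proposed estimator or its translation technique: it assumes a known toy distribution (the hypothetical `probs`) and uses only the textbook definitions of R{\'{e}}nyi entropy and min-entropy. The helper names `renyi_entropy` and `min_entropy` are likewise illustrative. It checks that the R{\'{e}}nyi entropy of any order above one upper-bounds the min-entropy, that the gap narrows as the order grows, and that the standard inequality $H_\infty \ge \frac{\alpha-1}{\alpha} H_\alpha$ holds.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (alpha > 1), in bits.

    Textbook definition: H_alpha = log2(sum_i p_i^alpha) / (1 - alpha).
    alpha = 2 gives the collision entropy.
    """
    if alpha <= 1:
        raise ValueError("this sketch only handles alpha > 1")
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log2(power_sum) / (1 - alpha)


def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely outcome's probability."""
    return -math.log2(max(probs))


# Hypothetical toy distribution, chosen only for illustration.
probs = [0.5, 0.25, 0.125, 0.125]

h_inf = min_entropy(probs)
h2 = renyi_entropy(probs, 2)   # collision entropy
h8 = renyi_entropy(probs, 8)   # a higher order, closer to the min-entropy

# Renyi entropy is non-increasing in the order, so both values
# overestimate the min-entropy, the higher order less so.
print(f"min-entropy       : {h_inf:.4f} bits")
print(f"collision entropy : {h2:.4f} bits")
print(f"order-8 entropy   : {h8:.4f} bits")

# Standard lower bound: H_inf >= (alpha - 1) / alpha * H_alpha.
for alpha, h in ((2, h2), (8, h8)):
    print(f"alpha={alpha}: lower bound on min-entropy = {(alpha - 1) / alpha * h:.4f} bits")
```

For this toy distribution the collision entropy exceeds the min-entropy by roughly half a bit, which is the kind of overestimation the abstract attributes to the LRS estimator. The sketch works on exact probabilities; estimating these quantities from samples of a non-IID source is the harder problem the paper addresses.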