To identify the infected individuals of a population, their samples are divided into equally sized groups called pools, and a single laboratory test is applied to each pool. Individuals whose samples belong to pools that test negative are declared healthy, while each pool that tests positive is divided into smaller, equally sized pools, which are tested in the next stage. In the final, $(k+1)$-th stage, all remaining samples are tested individually. If the infection probability $p$ satisfies $p<1-3^{-1/3}$, we minimize the expected number of tests per individual over the number $k+1$ of stages and the pool sizes in the first $k$ stages. We show that for each $p\in (0, 1-3^{-1/3})$ the optimal choice is one of four possible schemes, which are explicitly described. We conjecture that for each $p$ the optimal choice is one of the two sequences of pool sizes $(3^k\text{ or }3^{k-1}\cdot 4,3^{k-1},\dots,3^2,3)$, with a precise description of the range of $p$ where each is optimal. The conjecture is supported by overwhelming numerical evidence for $p>2^{-51}$. We also show that the cost of the best among the schemes $(3^k,\dots,3)$ is of order $O\big(p\log(1/p)\big)$, comparable to the information-theoretic lower bound $p\log_2(1/p)+(1-p)\log_2(1/(1-p))$, the entropy of a Bernoulli$(p)$ random variable.
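As an illustration of the quantity being minimized, the following Python sketch computes the expected number of tests per individual for a nested scheme with pool sizes $(m_1,\dots,m_k)$. It is not taken from the paper: it assumes independent infections with probability $p$, error-free tests, and that each pool size divides the previous one, so that a pool at stage $j$ is tested exactly when its parent pool of size $m_{j-1}$ contains at least one infected sample, which gives the cost $1/m_1+\sum_{j=2}^{k+1}(1-(1-p)^{m_{j-1}})/m_j$ with $m_{k+1}=1$. The function names are hypothetical.

```python
import math


def expected_tests_per_individual(pool_sizes, p):
    """Expected tests per individual for nested pool sizes (m_1, ..., m_k).

    Assumptions (not from the source): independent infections with
    probability p, error-free tests, each pool size divides the previous one.
    An empty list of pool sizes means individual testing (no pooling).
    """
    sizes = list(pool_sizes) + [1]  # stage k+1 tests single samples
    for parent, child in zip(sizes, sizes[1:]):
        if parent <= child or parent % child != 0:
            raise ValueError("pool sizes must decrease and divide the previous one")
    q = 1.0 - p
    cost = 1.0 / sizes[0]  # every first-stage pool is tested once
    for parent, child in zip(sizes, sizes[1:]):
        # A pool of size `child` is tested only if its parent pool tested
        # positive, which happens with probability 1 - q**parent.
        cost += (1.0 - q ** parent) / child
    return cost


def bernoulli_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) variable, the lower bound per individual."""
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def powers_of_three_scheme(k):
    """Pool sizes (3^k, 3^(k-1), ..., 3) for a scheme with k pooled stages."""
    return [3 ** j for j in range(k, 0, -1)]


if __name__ == "__main__":
    p = 0.01
    for k in range(1, 6):
        scheme = powers_of_three_scheme(k)
        print(k, scheme, expected_tests_per_individual(scheme, p))
    print("entropy bound:", bernoulli_entropy(p))
```

With $k=1$ and $m_1=3$ the cost reduces to $1/3+1-(1-p)^3$, which is below $1$ (the cost of testing every sample individually) exactly when $p<1-3^{-1/3}$, matching the threshold stated above.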