An examination of the generalised pooled binomial distribution and its information properties

This paper examines the statistical properties of a distribution that arises from pooled testing for the prevalence of a binary outcome. Our base distribution has two parameters, a prevalence parameter and an excess intensity parameter; the latter allows for a dilution or intensification effect in larger pools. We also examine a generalised form of the distribution in which pools carry covariate information that affects the prevalence through a linked linear form. We study the generalised pooled binomial distribution in its own right and as a special case of broader binomial GLMs that use the complementary log-log link function. We examine the information function and show the information content of individual sample items. We demonstrate that pooling reduces the information content of sample units, and we give simple heuristics for choosing an "optimal" pool size for testing. We derive the form of the log-likelihood function and its derivatives and give results for maximum likelihood estimation. We also discuss diagnostic testing of the positive pool probabilities, including testing for intensification/dilution in the testing mechanism. We illustrate the use of this distribution by applying it to pooled testing data on virus prevalence in a mosquito population.
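
To make the link to the complementary log-log GLM and the claim about information loss concrete, the following minimal sketch may help. It assumes the simplest special case only: a perfect test with no dilution or intensification, so that a pool of size s tests positive with probability 1 - (1 - theta)^s. This is not the two-parameter form studied in the paper, and the function names (`pool_positive_prob`, `cloglog`, `info_per_unit`) are illustrative, not taken from the paper.

```python
import math

def pool_positive_prob(theta, s):
    """Probability that a pool of s units tests positive.

    Assumption: perfect test, no dilution/intensification effect,
    independent units each positive with prevalence theta.
    """
    return 1.0 - (1.0 - theta) ** s

def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def info_per_unit(theta, s):
    """Fisher information about theta per sampled unit in a pool of size s.

    Derived under the same assumption: with q = 1 - theta and
    p = 1 - q**s, the per-pool information is (dp/dtheta)**2 / (p * (1 - p))
    = s**2 * q**(s - 2) / (1 - q**s); dividing by s gives the per-unit value.
    """
    q = 1.0 - theta
    return s * q ** (s - 2) / (1.0 - q ** s)

if __name__ == "__main__":
    theta = 0.1  # illustrative prevalence, not from the paper's data
    for s in (1, 2, 5, 10):
        p = pool_positive_prob(theta, s)
        # Under this assumption, cloglog(p) equals log(s) + cloglog(theta),
        # i.e. pool size enters as an offset on the cloglog scale.
        print(s, p, cloglog(p), math.log(s) + cloglog(theta), info_per_unit(theta, s))
```

In this special case the pool size enters the linear predictor as the offset log(s) on the complementary log-log scale, and the per-unit information falls as s grows, which is in line with the qualitative statements above. The two-parameter distribution with an excess intensity parameter would change the pool-positive probability and hence these expressions.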
