In recent months, owing to the Covid-19 emergency, questions about whether an individual belongs to a particular class (`infected or not infected') after being tagged as `positive' or `negative' by a test have never been so popular. Similarly, there has been strong interest in estimating the proportion of a population expected to have a given characteristic (`having or having had the virus'). Taking our cue from the many related discussions in the media, in addition to those in which we took part, we analyze these questions from a probabilistic (`Bayesian') perspective, considering several effects that play a role in evaluating the probabilities of interest. The resulting paper, written with didactic intent, is rather general and not strictly related to pandemics. The basic ideas of Bayesian inference are introduced, and the uncertainties on the performance of the tests are treated using the metrological concept of `systematics' and propagated into the quantities of interest following the rules of probability theory. The separation of `statistical' and `systematic' contributions to the uncertainty on the inferred proportion of infectees allows one to optimize the sample size. The role of `priors', often overlooked, is stressed; the use of `flat priors' is nevertheless recommended, since the resulting posterior distribution can be `reshaped' by an `informative prior' in a later step. Details of the calculations are given and useful approximate formulae are derived, although the hard work is done with the help of direct Monte Carlo simulations and Markov Chain Monte Carlo, implemented in R and JAGS (relevant code provided in the appendix).
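As a minimal illustration of the first question (the probability of being infected given a positive test), the sketch below applies Bayes' theorem directly. It is written in Python for brevity, not in the R and JAGS used by the paper. The function names and the numerical values of prevalence, sensitivity and specificity are assumptions chosen for illustration; they are not taken from the text above. The second pair of functions shows the simplest point estimate of the infected proportion from the observed fraction of positives, ignoring all uncertainties.

```python
def prob_infected_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(infected | positive test).

    prevalence  : prior probability that the individual is infected
    sensitivity : P(positive | infected)
    specificity : P(negative | not infected)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def expected_positive_fraction(prevalence, sensitivity, specificity):
    """Expected fraction of tested individuals tagged as positive."""
    return sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)


def naive_proportion_estimate(positive_fraction, sensitivity, specificity):
    """Invert the relation above to get a point estimate of the proportion.

    This ignores both statistical and systematic uncertainty; it is only
    the zeroth-order estimate, valid when sensitivity + specificity > 1.
    """
    return (positive_fraction - (1.0 - specificity)) / (sensitivity + specificity - 1.0)


if __name__ == "__main__":
    # Illustrative (assumed) numbers: 10% prevalence, 98% sensitivity, 88% specificity.
    p = prob_infected_given_positive(0.10, 0.98, 0.88)
    print(f"P(infected | positive) = {p:.3f}")

    f_pos = expected_positive_fraction(0.10, 0.98, 0.88)
    print(f"expected positive fraction = {f_pos:.3f}")
    print(f"recovered proportion = {naive_proportion_estimate(f_pos, 0.98, 0.88):.3f}")
```

With these assumed numbers a positive result leaves the probability of infection below one half, because false positives from the large non-infected group outnumber true positives. This is the role of the prior that the abstract stresses.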