Diagnostic tests are almost never perfect. Studies that quantify their performance rely on knowledge of the true health status, which is measured with a reference diagnostic test. Researchers commonly assume that the reference test is perfect, which is not the case in practice. When the assumption fails, conventional studies identify "apparent" performance, that is, performance with respect to the reference, but not true performance. This paper provides the smallest possible bounds on the measures of true performance, namely sensitivity (true positive rate) and specificity (true negative rate), or equivalently false positive and negative rates, in standard settings. Implied bounds on policy-relevant parameters are derived: 1) prevalence in screened populations; 2) predictive values. Methods for inference based on moment inequalities are used to construct confidence sets that are uniformly consistent in level over a relevant family of data distributions. Emergency Use Authorization (EUA) and independent study data for the BinaxNOW COVID-19 antigen test demonstrate that the bounds can be very informative. The analysis reveals that the estimated false negative rates for symptomatic and asymptomatic patients are up to 3.17 and 4.59 times higher, respectively, than the frequently cited "apparent" false negative rate.
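To make the gap between "apparent" and true performance concrete, the following is a minimal sketch rather than the bounding method of the paper. It assumes a hypothetical group of 100 truly infected individuals and a reference test with perfect specificity, so that every reference-positive individual is truly infected. The function name `false_negative_rates` and all counts are invented for illustration only.

```python
def false_negative_rates(diseased_counts):
    """Return (apparent_fnr, true_fnr) for an index test.

    diseased_counts maps (index_positive, reference_positive) -> count
    among truly diseased individuals. Assumes (illustratively) that the
    reference test has perfect specificity, so reference positives are
    all truly diseased.
    """
    n = diseased_counts
    # Apparent FNR conditions on the reference result being positive.
    reference_positive = n[(True, True)] + n[(False, True)]
    apparent_fnr = n[(False, True)] / reference_positive
    # True FNR conditions on the true health status (all diseased).
    total_diseased = sum(n.values())
    true_fnr = (n[(False, True)] + n[(False, False)]) / total_diseased
    return apparent_fnr, true_fnr


# Hypothetical counts: the reference misses 20 of 100 infected people,
# and the index test also misses every one of those 20.
counts = {
    (True, True): 68,
    (False, True): 12,
    (True, False): 0,
    (False, False): 20,
}
apparent, true_ = false_negative_rates(counts)
print(f"apparent FNR = {apparent:.2f}, true FNR = {true_:.2f}")
```

In this invented scenario the index test misses the same cases as the reference, so the false negative rate measured against the reference understates the true one. Because the joint errors of the two tests are not observed, only bounds on the true rates can be recovered from such data.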