Surveillance research is essential for effective and efficient epidemiological monitoring of case counts and disease prevalence. Motivated by ongoing efforts to identify recurrent cases in the Georgia Cancer Registry, we extend the recently proposed "anchor stream" sampling design and estimation methodology. Our approach offers a more efficient and defensible alternative to traditional capture-recapture (CRC) methods by leveraging a relatively small random sample of participants whose recurrence status is obtained through a principled application of medical records abstraction. This sample is combined with one or more existing signaling data streams, which may yield data based on arbitrarily non-representative subsets of the full registry population. The key extension developed here accounts for the common problem of false-positive or false-negative diagnostic signals from the existing data stream(s). In particular, we show that the design only requires documentation of positive signals in these non-anchor surveillance streams, and permits valid estimation of the true case count based on an estimable positive predictive value (PPV) parameter. We borrow ideas from the multiple imputation paradigm to provide accompanying standard errors, and develop an adapted Bayesian credible interval approach that yields favorable frequentist coverage properties. We demonstrate the benefits of the proposed methods through simulation studies, and provide a data example targeting estimation of the breast cancer recurrence case count among Metro Atlanta area patients from the Georgia Cancer Registry-based Cancer Recurrence Information and Surveillance Program (CRISP) database.
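To make the role of the PPV concrete, the following is a minimal illustrative sketch, not the estimator developed in the paper. It assumes a single error-prone signaling stream, an anchor sample drawn by simple random sampling from the full registry, and true recurrence status known for every anchor member. The function name `estimate_case_count` and all numbers are hypothetical. Positive signals are scaled by the estimated PPV, and cases missed by the stream are recovered from the anchor members who had no signal.

```python
def estimate_case_count(n_total, n_signal, anchor):
    """Simplified stratified estimate of the true case count (illustration only).

    n_total  : size of the registry population
    n_signal : number of registry members with a positive signal in the stream
    anchor   : list of (signaled, true_case) booleans for the random anchor
               sample, where true_case comes from medical records abstraction
    """
    truth_among_signaled = [case for signaled, case in anchor if signaled]
    truth_among_unsignaled = [case for signaled, case in anchor if not signaled]
    if not truth_among_signaled or not truth_among_unsignaled:
        raise ValueError("anchor sample must contain signaled and unsignaled members")

    # Estimated PPV: share of signaled anchor members who are true cases.
    ppv_hat = sum(truth_among_signaled) / len(truth_among_signaled)
    # Estimated share of true cases among members the stream did not flag.
    miss_rate_hat = sum(truth_among_unsignaled) / len(truth_among_unsignaled)

    n_hat = n_signal * ppv_hat + (n_total - n_signal) * miss_rate_hat
    return n_hat, ppv_hat, miss_rate_hat


# Hypothetical numbers: 1000 registry members, 300 with a positive signal,
# and an anchor sample of 20 (10 signaled, 10 unsignaled).
anchor_sample = (
    [(True, True)] * 8 + [(True, False)] * 2
    + [(False, True)] * 1 + [(False, False)] * 9
)
n_hat, ppv_hat, miss_rate_hat = estimate_case_count(1000, 300, anchor_sample)
print(f"PPV estimate: {ppv_hat:.2f}, estimated case count: {n_hat:.1f}")
```

With these made-up inputs the estimated PPV is 0.8 and the estimated case count is about 310. The sketch omits the standard errors and the adapted Bayesian credible interval described above.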