Standard (network) meta-analysis methods for evaluating medical test accuracy analyse the data separately for each test threshold, which wastes data unless every study reports all thresholds. Previously proposed "multiple threshold" models either fail to provide threshold-specific summary estimates or assume that ordinal tests (e.g., questionnaires) are continuous. We propose two ordinal regression models, ordinal-bivariate and ordinal-HSROC, that use an induced-Dirichlet framework for the cutpoint parameters, enabling intuitive priors and both fixed-effects and random-effects cutpoints. We conducted a simulation study to evaluate the performance of the proposed models, with simulated data based on real anxiety screening data spanning 7, 22, and 64 ordinal categories, with 15%, 40%, and 55% missing threshold data. The proposed ordinal-bivariate model with fixed-effects cutpoints tended to obtain the best RMSE and bias, including when data were generated from a recently proposed continuous-assumption model. For instance, even with 64 categories, continuous models performed 10%-30% worse than our models, contradicting the common assumption that many categories justify treating ordinal tests as continuous. Furthermore, the standard stratified-bivariate approach performed worse, especially for tests with higher missingness. We implemented the models in the MetaOrdDTA R package (https://github.com/CerulloE1996/MetaOrdDTA), which provides estimation via Stan, K-fold cross-validation for model selection, meta-regression, network meta-analysis extensions, and visualisation tools including sROC plots with credible/prediction regions. Overall, our simulation study suggests that the proposed models may obtain better accuracy estimates than previous approaches for ordinal tests, even when the number of ordinal categories is very high.
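To make the induced-Dirichlet idea for cutpoints more concrete, the following is a minimal Python sketch and not the MetaOrdDTA implementation. It assumes a probit link with an anchor location of zero, and all function names (sample_dirichlet, cutpoints_from_probs, probs_from_cutpoints, exceedance_probs) are hypothetical names introduced here for illustration. The sketch draws category probabilities from a Dirichlet distribution, maps them to ordered cutpoints, and then computes the probability of scoring above each threshold for a given latent location.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed probit link; the actual models may differ


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def cutpoints_from_probs(probs):
    """Map K category probabilities to K-1 ordered cutpoints (anchor location 0)."""
    cutpoints = []
    cumulative = 0.0
    for p in probs[:-1]:
        cumulative += p
        cutpoints.append(STD_NORMAL.inv_cdf(cumulative))
    return cutpoints


def probs_from_cutpoints(cutpoints, location=0.0):
    """Ordinal category probabilities implied by cutpoints and a latent location."""
    bounds = [0.0] + [STD_NORMAL.cdf(c - location) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def exceedance_probs(cutpoints, location=0.0):
    """Probability of scoring above each cutpoint (one value per threshold)."""
    return [1.0 - STD_NORMAL.cdf(c - location) for c in cutpoints]


rng = random.Random(1)                        # fixed seed for reproducibility
probs = sample_dirichlet([1.0] * 7, rng)      # 7 categories, flat Dirichlet prior
cutpoints = cutpoints_from_probs(probs)       # 6 ordered cutpoints
recovered = probs_from_cutpoints(cutpoints)   # round trip back to probabilities

baseline = exceedance_probs(cutpoints, location=0.0)
shifted = exceedance_probs(cutpoints, location=1.0)  # hypothetical higher latent mean
```

Under these assumptions, a prior placed on the category probabilities induces a prior on the cutpoints, which is why such priors can be stated on an intuitive probability scale. Shifting the latent location upward raises the probability of exceeding every threshold, which is how a single set of cutpoints can yield threshold-specific accuracy estimates.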