The exponential growth of scientific production makes secondary literature abridgements increasingly demanding. We introduce a new open-source framework for systematic reviews that significantly reduces the time and workload required to collect and screen scientific literature. The framework provides three main tools: 1) an automatic citation search engine and manager that collects records from multiple online sources with a unified query syntax; 2) a Bayesian active machine learning citation screening tool based on iterative human-machine interaction to increase predictive accuracy; and 3) a semi-automatic, data-driven query generator that creates new search queries from existing citation data sets. To evaluate the automatic screener's performance, we estimated the median posterior sensitivity and efficiency [90% credible intervals] using Bayesian simulation to predict the distribution of undetected potentially relevant records. Tested on an example topic, the framework collected 17,755 unique records through the citation manager; 766 records required human evaluation, while the rest were excluded by the automatic classifier; the theoretical efficiency was 95.6% [95.3%, 95.7%] with a sensitivity of 100% [93.5%, 100%]. A new search query was then generated from the labelled data set, and 82,579 additional records were collected; only 567 of these required human review after automatic screening, and six additional positive matches were found. The overall expected sensitivity decreased to 97.3% [73.8%, 100%], while the efficiency increased to 98.6% [98.2%, 98.7%]. The framework can significantly reduce the workload required to conduct large literature reviews by simplifying citation collection and screening while maintaining high sensitivity. Such a tool can improve the standardization and repeatability of systematic reviews.
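To make the reported metrics easier to interpret, the sketch below illustrates one way a Bayesian simulation of undetected relevant records could yield credible intervals for sensitivity and efficiency. It is not the framework's actual model: it assumes a simple Beta-Binomial model for the probability that an automatically excluded record is in fact relevant, and both the prior parameters (`alpha`, `beta`) and the number of relevant records found during manual review (`n_relevant_found`) are hypothetical placeholders, not values reported above. Efficiency is read here as the share of collected records that never needs manual review, and sensitivity as the share of all relevant records that the screening process actually surfaces.

```python
import numpy as np

rng = np.random.default_rng(42)

# Counts from the first screening phase reported in the abstract.
n_collected = 17_755        # unique records collected by the citation manager
n_reviewed = 766            # records that required human evaluation
n_auto_excluded = n_collected - n_reviewed

# Hypothetical value: relevant records found during manual review
# (the abstract does not report this number).
n_relevant_found = 50

# Hypothetical Beta prior on the probability that an automatically
# excluded record is actually relevant (a missed positive).
alpha, beta = 1, 1000

n_sims = 100_000
p_miss = rng.beta(alpha, beta, size=n_sims)     # sample plausible miss rates
missed = rng.binomial(n_auto_excluded, p_miss)  # predicted undetected relevant records

# Sensitivity: relevant records found over all relevant records (found + missed).
sensitivity = n_relevant_found / (n_relevant_found + missed)
# Efficiency: share of collected records spared from manual review, counting
# predicted misses as records that would still have to be read.
efficiency = 1 - (n_reviewed + missed) / n_collected

def summarize(samples, label):
    lo, med, hi = np.percentile(samples, [5, 50, 95])  # median and 90% credible interval
    print(f"{label}: median {med:.3f}, 90% CrI [{lo:.3f}, {hi:.3f}]")

summarize(sensitivity, "sensitivity")
summarize(efficiency, "efficiency")
```

In the actual framework, such a prior would presumably be informed by the records labelled during the human-machine interaction rather than fixed by hand; the sketch only shows how a posterior over undetected relevant records translates into credible intervals for sensitivity and efficiency of the kind reported above.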