In the medical domain, a Systematic Literature Review (SLR) attempts to collect all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. Preparing an SLR involves multiple labor-intensive, time-consuming tasks and incurs large monetary costs. Technology-assisted review (TAR) methods automate the different steps of creating an SLR, with a particular focus on reducing the screening burden on reviewers. We present a novel TAR method that implements a full pipeline, from the research protocol to the screening of the relevant papers. Our pipeline removes the need for a Boolean query constructed by specialists and consists of three components: the primary retrieval engine, the inter-review ranker, and the intra-review ranker, combining learning-to-rank techniques with a relevance feedback method. In addition, we contribute an updated version of the dataset for Task 2 of the CLEF 2019 eHealth Lab, which we make publicly available. Empirical results on this dataset show that our approach can achieve state-of-the-art results.