Pre-trained and fine-tuned transformer models like BERT and T5 have improved the state of the art in ad hoc retrieval and question answering, but not yet in high-recall information retrieval, where the objective is to retrieve substantially all relevant documents. We investigate whether the use of transformer-based models for reranking and/or featurization can improve the Baseline Model Implementation of the TREC Total Recall Track, which represents the current state of the art for high-recall information retrieval. We also introduce CALBERT, a model that can be used to continuously fine-tune a BERT-based model based on relevance feedback.