Pyserini is an easy-to-use Python toolkit that supports replicable IR research by providing effective first-stage retrieval in a multi-stage ranking architecture. Our toolkit is self-contained as a standard Python package and comes with queries, relevance judgments, pre-built indexes, and evaluation scripts for many commonly used IR test collections. We aim to support, out of the box, the entire research lifecycle of efforts aimed at improving ranking with modern neural approaches. In particular, Pyserini supports sparse retrieval (e.g., BM25 scoring using bag-of-words representations), dense retrieval (e.g., nearest-neighbor search on transformer-encoded representations), as well as hybrid retrieval that integrates both approaches. This paper provides an overview of toolkit features and presents empirical results that illustrate its effectiveness on two popular ranking tasks. We also describe how our group has built a culture of replicability through shared norms and tools that enable rigorous automated testing.
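The three retrieval modes named above can be illustrated with a minimal, self-contained sketch. It does not use Pyserini's API and is not the toolkit's implementation. The toy corpus, the `toy_encode` stand-in for a transformer encoder, the BM25 parameters `k1=0.9` and `b=0.4`, and the min-max interpolation with weight `alpha` are all assumptions chosen for illustration. A real system would use an inverted index for sparse retrieval and an approximate nearest-neighbor index over learned embeddings for dense retrieval.

```python
import math
import zlib
from collections import Counter

# Hypothetical toy corpus; a real experiment would use an IR test collection.
DOCS = {
    "d1": "sparse retrieval with bm25 scoring over bag of words",
    "d2": "dense retrieval with nearest neighbor search over encoded vectors",
    "d3": "hybrid retrieval combines sparse and dense scores",
}


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; no stemming or stopwords)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=0.9, b=0.4):
    """Sparse retrieval: score every document with a basic BM25 variant."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized.values()) / n
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


def toy_encode(text, dim=64):
    """Stand-in for a transformer encoder: hashed character trigrams, L2-normalized."""
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        vec[zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dense_scores(query, docs):
    """Dense retrieval: brute-force inner product between query and document vectors."""
    q = toy_encode(query)
    return {
        doc_id: sum(a * b for a, b in zip(q, toy_encode(text)))
        for doc_id, text in docs.items()
    }


def min_max(scores):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(query, docs, alpha=0.5):
    """Hybrid retrieval: linear interpolation of normalized sparse and dense scores."""
    sparse = min_max(bm25_scores(query, docs))
    dense = min_max(dense_scores(query, docs))
    return {d: alpha * sparse[d] + (1 - alpha) * dense[d] for d in docs}


def rank(scores):
    """Return document ids sorted by descending score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    query = "bm25 scoring"
    for name, fn in [("sparse", bm25_scores), ("dense", dense_scores), ("hybrid", hybrid_scores)]:
        print(name, rank(fn(query, DOCS)))
```

The interpolation weight `alpha` trades off exact term matching against similarity in the vector space. Normalizing both score lists first is one common design choice, since raw BM25 scores and inner products live on different scales.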