Inverted indexes make it possible to query large databases without having to search the database itself at each query. An important line of research aims to construct the most efficient inverted indexes, in terms of both compression ratio and time efficiency. In this article, we show how to use trit encoding, combined with contextual methods, for computing inverted indexes. We perform an extensive study of different variants of these methods and show that our method consistently outperforms the Binary Interpolative Method, one of the gold standards in this area, with respect to compression size. We apply our methods to a variety of datasets and make available the source code that produced the results, together with all our datasets.
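To make the terms concrete, the following is a minimal sketch of an inverted index whose posting lists are stored as gaps and then written over a three-symbol alphabet. The specific scheme is an assumption chosen for illustration and is not taken from the article: each gap is written in binary, the third symbol (trit 2) marks the end of a number, and five trits are packed per byte (possible because 3^5 = 243 ≤ 256). All function names are hypothetical, and the contextual methods mentioned above are not modeled here.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def to_gaps(postings):
    """Replace sorted doc IDs by the first ID followed by successive differences."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def from_gaps(gaps):
    """Inverse of to_gaps: a running sum restores the doc IDs."""
    postings, total = [], 0
    for g in gaps:
        total += g
        postings.append(total)
    return postings


def encode_trits(numbers):
    """Assumed scheme: binary digits of each number, then trit 2 as terminator."""
    trits = []
    for n in numbers:
        trits.extend(int(bit) for bit in bin(n)[2:])
        trits.append(2)
    return trits


def pack_trits(trits):
    """Pack five trits per byte; pad with terminators, which the decoder skips."""
    padded = trits + [2] * (-len(trits) % 5)
    out = bytearray()
    for i in range(0, len(padded), 5):
        out.append(sum(t * 3 ** k for k, t in enumerate(padded[i:i + 5])))
    return bytes(out)


def unpack_trits(data):
    """Inverse of pack_trits (padding trits are still present in the result)."""
    trits = []
    for byte in data:
        for _ in range(5):
            trits.append(byte % 3)
            byte //= 3
    return trits


def decode_trits(trits):
    """Split on terminators; empty groups can only come from padding."""
    numbers, current = [], []
    for t in trits:
        if t == 2:
            if current:
                numbers.append(int("".join(map(str, current)), 2))
            current = []
        else:
            current.append(t)
    return numbers
```

A posting list such as `[3, 7, 8, 20]` becomes the gaps `[3, 4, 1, 12]`, which `pack_trits(encode_trits(...))` stores in three bytes; `from_gaps(decode_trits(unpack_trits(...)))` recovers the original list. This only illustrates the representation and makes no claim about the compression ratios reported in the article.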