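AIR's query reformulation is driven by the remainder terms, i.e. the set of query terms that are not yet covered by the set of justifications retrieved in the first $i$ iterations:

$$Q_r(i) = t(Q) - \bigcup_{s_k \in S_i} t(s_k)$$

where $t(Q)$ denotes the unique set of query terms, $t(s_k)$ denotes the unique set of terms of the $k$-th justification, and $S_i$ denotes the set of $i$ justification sentences. The authors use the soft matching of alignment for the inclusion operation: a query term is considered to be included in the justification terms if its cosine similarity with a justification term is greater than a similarity threshold $M$ (the authors use $M = 0.95$ in all experiments), which ensures that the two terms are similar in the embedding space.

2.1.3 Coverage

The authors propose a metric that measures how well the query keywords are covered by the retrieved chain of justifications $S$:

$$Q_c(i) = \frac{\left|\bigcup_{s_k \in S_i} t(Q) \cap t(s_k)\right|}{|t(Q)|}$$

where $|t(Q)|$ is the number of unique query terms.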
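The remainder-term and coverage computations with soft matching can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`is_covered`, `remainder_terms`, `coverage`) are hypothetical, the 2-dimensional vectors are made-up stand-ins for real word embeddings, and treating an exact string match as covered (so that out-of-vocabulary terms can still match) is an assumption of this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def is_covered(term, justification_terms, emb, threshold=0.95):
    """Soft inclusion: the term counts as covered if some justification term
    has cosine similarity above the threshold."""
    if term in justification_terms:  # assumption: exact match always counts
        return True
    if term not in emb:
        return False
    return any(
        t in emb and cosine(emb[term], emb[t]) > threshold
        for t in justification_terms
    )

def remainder_terms(query_terms, justifications, emb, threshold=0.95):
    """Unique query terms not yet covered by the retrieved justifications."""
    retrieved = set().union(*justifications)
    return {
        q for q in set(query_terms)
        if not is_covered(q, retrieved, emb, threshold)
    }

def coverage(query_terms, justifications, emb, threshold=0.95):
    """Fraction of unique query terms covered by the retrieved justifications."""
    unique = set(query_terms)
    if not unique:
        return 0.0
    remaining = remainder_terms(unique, justifications, emb, threshold)
    return (len(unique) - len(remaining)) / len(unique)

# Toy embeddings (made up for illustration only).
emb = {
    "car": [1.0, 0.0],
    "automobile": [0.99, 0.05],
    "engine": [0.0, 1.0],
    "fuel": [0.6, 0.8],
}
query = ["car", "engine", "fuel"]

# After one justification: "car" is softly matched by "automobile".
s1 = [{"automobile"}]
print(sorted(remainder_terms(query, s1, emb)))  # ['engine', 'fuel']

# After a second justification that mentions "engine" explicitly.
s2 = [{"automobile"}, {"engine", "power"}]
print(sorted(remainder_terms(query, s2, emb)))  # ['fuel']
print(coverage(query, s2, emb))
```

In this toy run, "car" is never mentioned literally, yet it is removed from the remainder because its similarity to "automobile" exceeds 0.95, while "fuel" stays in the remainder since its similarity to "engine" is only 0.8.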
[1] Vikas Yadav, Steven Bethard, and Mihai Surdeanu. 2019a. Alignment over heterogeneous embeddings for question answering. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (Long Papers), Minneapolis, USA. Association for Computational Linguistics.
[2] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.
[3] Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, and Dan Roth. 2018a. Looking beyond the surface: A challenge set for reading comprehension over multiple sentences. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 252–262.
[4] Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, and Ashish Sabharwal. 2019a. QASC: A dataset for question answering via sentence composition. arXiv preprint arXiv:1910.11473.
[5] Sun Kim, Nicolas Fiorini, W John Wilbur, and Zhiyong Lu. 2017. Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents. Journal of Biomedical Informatics, 75:122–127.