Open-domain extractive question answering works well on textual data: candidate texts are first retrieved and the answer is then extracted from those candidates. However, some questions cannot be answered from text alone and require information stored in tables. In this paper, we present an approach for retrieving both texts and tables relevant to a question by jointly encoding texts, tables, and questions into a single vector space. To this end, we create a new multi-modal dataset based on text and table datasets from related work and compare the retrieval performance of different encoding schemata. We find that dense vector embeddings from transformer models outperform sparse embeddings on four out of six evaluation datasets. Comparing different dense embedding models, tri-encoders, with one encoder each for questions, texts, and tables, improve retrieval performance over bi-encoders, which use one encoder for questions and one shared encoder for both texts and tables. We release the newly created multi-modal dataset to the community so that it can be used for training and evaluation.
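To make the bi-/tri-encoder distinction concrete, the following is a minimal PyTorch sketch of the tri-encoder retrieval setup described above. It is an illustrative assumption, not the paper's implementation: `ToyEncoder`, `TriEncoderRetriever`, the embedding dimension, and the table linearisation format are hypothetical stand-ins for the actual transformer encoders; a bi-encoder variant would simply reuse the text encoder for tables.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyEncoder(nn.Module):
    """Stand-in for a transformer encoder (hypothetical): hashes whitespace
    tokens into an embedding bag and mean-pools them, so the sketch runs
    without downloading a pretrained model."""

    def __init__(self, dim=64, vocab_size=10_000):
        super().__init__()
        self.vocab_size = vocab_size
        self.embedding = nn.EmbeddingBag(vocab_size, dim, mode="mean")

    def forward(self, inputs):
        # inputs: list of strings (questions, texts, or linearised tables)
        rows = []
        for s in inputs:
            ids = torch.tensor([hash(tok) % self.vocab_size for tok in s.split()] or [0])
            rows.append(self.embedding(ids.unsqueeze(0)))
        # L2-normalise so dot product corresponds to cosine similarity
        return F.normalize(torch.cat(rows, dim=0), dim=-1)


class TriEncoderRetriever(nn.Module):
    """Tri-encoder sketch: one encoder each for questions, texts, and tables,
    all mapping into a single shared vector space."""

    def __init__(self, dim=64):
        super().__init__()
        self.question_encoder = ToyEncoder(dim)
        self.text_encoder = ToyEncoder(dim)
        self.table_encoder = ToyEncoder(dim)

    def retrieve(self, question, texts, tables, top_k=3):
        q = self.question_encoder([question])                 # (1, dim)
        candidates = torch.cat([self.text_encoder(texts),     # (n_texts, dim)
                                self.table_encoder(tables)],  # (n_tables, dim)
                               dim=0)
        scores = candidates @ q.squeeze(0)                     # dot-product ranking
        k = min(top_k, candidates.size(0))
        return torch.topk(scores, k=k)


if __name__ == "__main__":
    retriever = TriEncoderRetriever()
    texts = ["Paris is the capital of France."]
    tables = ["country | capital || France | Paris"]  # one possible linearisation
    print(retriever.retrieve("What is the capital of France?", texts, tables))
```

In a trained system the three encoders would be fine-tuned transformer models and retrieval over a large corpus would use an approximate nearest-neighbour index rather than an exhaustive dot product; the sketch only shows how texts and tables end up as candidates in the same embedding space.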