We present the first end-to-end, transformer-based table question answering (QA) system that takes natural language questions and a massive table corpus as inputs, retrieves the most relevant tables, and locates the correct table cells to answer each question. Our system, CLTR, extends the current state-of-the-art model for QA over tables into an end-to-end table QA architecture. It has successfully tackled many real-world table QA problems with a simple, unified pipeline. CLTR can also generate a heatmap of candidate columns and rows over complex tables, allowing users to quickly identify the cells that answer a question. In addition, we introduce two new open-domain benchmarks, E2E_WTQ and E2E_GNQ, consisting of 2,005 natural language questions over 76,242 tables. The benchmarks are designed to validate CLTR and to support future research and experiments on table retrieval and end-to-end table QA. Our experiments demonstrate that our system is the current state of the art on the table retrieval task and produces promising results for end-to-end table QA.