Table-based reasoning has shown remarkable progress in combining deep models with discrete reasoning, which requires reasoning over both free-form natural language (NL) questions and structured tabular data. However, previous table-based reasoning solutions usually suffer from significant performance degradation on huge evidence (tables). In addition, most existing methods struggle to reason over complex questions because the required information is scattered across different places. To alleviate these challenges, we exploit large language models (LLMs) as decomposers for effective table-based reasoning, which (i) decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information for table reasoning, and (ii) decompose complex questions into simpler sub-questions for text reasoning. Specifically, we first use the LLMs to break down the evidence (tables) involved in the current question, retaining the relevant evidence and excluding the remaining irrelevant evidence from the huge table. In addition, we propose a "parsing-execution-filling" strategy to alleviate the hallucination dilemma of chain-of-thought reasoning by decoupling logic from numerical computation in each step. Extensive experiments show that our method can effectively leverage decomposed evidence and questions, and it outperforms strong baselines on the TabFact, WikiTableQuestions, and FetaQA datasets. Notably, our model surpasses human performance for the first time on the TabFact dataset.
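To make the pipeline in the abstract concrete, below is a minimal sketch of the three steps it describes: evidence decomposition (selecting a sub-table), question decomposition (splitting into sub-questions), and "parsing-execution-filling" (letting the LLM write the logic while a real engine does the numerical computation). The prompt templates, the generic `llm` callable, and the use of SQLite as the executor are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
# Sketch of LLM-as-decomposer table reasoning (assumed prompts and interfaces).
from typing import Callable, List, Tuple
import sqlite3


def decompose_evidence(llm: Callable[[str], str], header: List[str],
                       rows: List[List[str]], question: str) -> Tuple[List[str], List[List[str]]]:
    """Ask the LLM which rows/columns are relevant, then keep only that sub-table
    to reduce interference from useless cells in a huge table."""
    table_text = " | ".join(header) + "\n" + "\n".join(" | ".join(r) for r in rows)
    prompt = ("Given the table below, list the 0-based row indices and column names "
              f"needed to answer the question.\nTable:\n{table_text}\n"
              f"Question: {question}\nAnswer as: rows=<indices>; cols=<names>")
    rows_part, cols_part = llm(prompt).split(";")
    row_ids = [int(i) for i in rows_part.split("=")[1].split(",")]
    cols = [c.strip() for c in cols_part.split("=")[1].split(",")]
    col_ids = [header.index(c) for c in cols]
    sub_rows = [[rows[r][c] for c in col_ids] for r in row_ids]
    return cols, sub_rows


def decompose_question(llm: Callable[[str], str], question: str) -> List[str]:
    """Break a complex question into simpler sub-questions for text reasoning."""
    prompt = f"Decompose the question into simpler sub-questions, one per line:\n{question}"
    return [q.strip() for q in llm(prompt).splitlines() if q.strip()]


def parse_execute_fill(llm: Callable[[str], str], header: List[str],
                       sub_rows: List[List[str]], sub_question: str) -> str:
    """Parsing-execution-filling: the LLM parses the sub-question into a SQL query
    (logic), SQLite executes it (numerical computation), and the result is filled
    back into the reasoning chain instead of being hallucinated by the model."""
    conn = sqlite3.connect(":memory:")
    cols_sql = ", ".join(f'"{h}" TEXT' for h in header)
    conn.execute(f"CREATE TABLE t ({cols_sql})")
    conn.executemany(f"INSERT INTO t VALUES ({', '.join('?' * len(header))})", sub_rows)
    sql = llm(f"Write one SQLite query over table t({', '.join(header)}) answering: {sub_question}")
    result = conn.execute(sql).fetchall()
    return f"{sub_question} -> {result}"
```

The key design choice this sketch tries to reflect is the decoupling in the last step: the LLM only produces symbolic logic, while the actual arithmetic is delegated to an executor, so numerical answers in each reasoning step are computed rather than generated.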