Table Question Answering (TQA) is an important but under-explored task. Most existing QA datasets are in unstructured text format, and only a few use tables as the context. To the best of our knowledge, no TQA dataset exists in the biomedical domain, where tables are frequently used to present information. In this paper, we first curate a table question answering dataset, BioTABQA, using 22 templates and context drawn from a biomedical textbook on differential diagnosis. BioTABQA can not only be used to teach a model how to answer questions from tables but also to evaluate how well a model generalizes to unseen questions, an important scenario for biomedical applications. To enable this generalization evaluation, we divide the templates into 17 training templates and 5 cross-task evaluation templates. We then develop two baselines using single- and multi-task learning on BioTABQA. Furthermore, we explore instruction learning, a recent technique that has shown impressive generalization performance. Experimental results show that our instruction-tuned model outperforms the single- and multi-task baselines by ~23% and ~6% on average across various evaluation settings and, more importantly, outperforms both baselines by ~5% on cross-task evaluation.