Question answering (QA) has made impressive progress in answering questions from customized domains. Nevertheless, domain adaptation remains one of the most elusive challenges for QA systems, especially when a system is trained in a source domain but deployed in a different target domain. In this work, we investigate the potential benefits of question classification for QA domain adaptation. We propose a novel framework: Question Classification for Question Answering (QC4QA). Specifically, a question classifier assigns question classes to both the source and target data. We then perform joint training in a self-supervised fashion via pseudo-labeling. For optimization, the inter-domain discrepancy between the source and target domains is reduced via the maximum mean discrepancy (MMD) distance. We additionally minimize the intra-class discrepancy among QA samples of the same question class for fine-grained adaptation. To the best of our knowledge, this is the first work in QA domain adaptation to leverage question classification with self-supervised adaptation. We demonstrate the effectiveness of QC4QA with consistent improvements over state-of-the-art baselines on multiple datasets.
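To make the MMD-based objective mentioned in the abstract concrete, the following is a minimal sketch and not the QC4QA implementation. It assumes a Gaussian (RBF) kernel with a fixed bandwidth, the biased squared-MMD estimator, and feature matrices with one row per QA sample; none of these choices is stated in the abstract. The names `rbf_kernel`, `mmd_squared`, and `intra_class_mmd` are hypothetical, and reading the intra-class term as a per-class MMD between source samples and pseudo-labeled target samples is one possible interpretation.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * (a @ b.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature samples."""
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())


def intra_class_mmd(source, target, source_classes, target_classes,
                    bandwidth=1.0):
    """Average per-class squared MMD over classes present in both domains.

    Assumption: target_classes are pseudo-labels produced by a question
    classifier; this per-class formulation is illustrative only.
    """
    shared = np.intersect1d(source_classes, target_classes)
    if shared.size == 0:
        return 0.0
    values = [
        mmd_squared(source[source_classes == c],
                    target[target_classes == c],
                    bandwidth)
        for c in shared
    ]
    return float(np.mean(values))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(0.0, 1.0, size=(50, 4))   # stand-in source features
    tgt = rng.normal(3.0, 1.0, size=(50, 4))   # stand-in shifted target features
    print("inter-domain MMD^2:", mmd_squared(src, tgt))
```

In a training loop, a term of this kind would typically be added to the QA loss so that source and target representations are pulled together; the weighting between the terms is left open here because the abstract does not specify it.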