In this paper, we propose the task of consecutive question generation (CQG), which generates a set of logically related question-answer pairs to understand a whole passage, with comprehensive consideration of accuracy, coverage, and informativeness. To achieve this, we first examine the four key elements of CQG, i.e., question, answer, rationale, and context history, and propose a novel dynamic multitask framework with one main task that generates a question-answer pair and four auxiliary tasks that generate the other elements. This framework directly helps the model generate good questions through both joint training and self-reranking. At the same time, to fully exploit the worth-asking information in a given passage, we use the reranking losses to sample rationales and search globally for the best question series. Finally, we evaluate our strategy through QA data augmentation and manual evaluation, as well as a novel application of the generated question-answer pairs to DocNLI. We show that our strategy significantly improves question generation and benefits multiple related NLP tasks.