We propose a multilingual, data-driven method for generating reading comprehension questions using dependency trees. Our method provides a strong, mostly deterministic, and inexpensive-to-train baseline for less-resourced languages. While a language-specific corpus is still required, it can be far smaller than the corpora required by modern neural question generation (QG) architectures. Our method surpasses QG baselines previously reported in the literature and performs well in human evaluation.