Cross-lingual summarization (CLS) is the task of producing a summary in one language for a source document written in a different language. We introduce WikiMulti, a new dataset for cross-lingual summarization based on Wikipedia articles in 15 languages. As a set of baselines for further studies, we evaluate the performance of existing cross-lingual abstractive summarization methods on our dataset. We make our dataset publicly available here: https://github.com/tikhonovpavel/wikimulti