We present Cross-lingual Open-Retrieval Answer Generation (CORA), the first unified many-to-many question answering (QA) model that can answer questions across many languages, even for ones without language-specific annotated data or knowledge sources. We introduce a new dense passage retrieval algorithm that is trained to retrieve documents across languages for a question. Combined with a multilingual autoregressive generation model, CORA answers directly in the target language without any translation or in-language retrieval modules as used in prior work. We propose an iterative training method that automatically extends annotated data available only in high-resource languages to low-resource ones. Our results show that CORA substantially outperforms the previous state of the art on multilingual open QA benchmarks across 26 languages, 9 of which are unseen during training. Our analyses show the significance of cross-lingual retrieval and generation in many languages, particularly under low-resource settings.
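The pipeline described above — a single dense retriever that scores passages across all languages, followed by a multilingual generator that answers directly in the question's language — can be sketched as follows. This is an illustrative toy, not the authors' implementation: `embed` is a deterministic stand-in for a trained multilingual encoder, and `answer` stands in for the autoregressive generation model; all function names here are hypothetical.

```python
import hashlib
import math
import random

def embed(text: str, dim: int = 16) -> list[float]:
    # Toy deterministic "embedding" so the sketch runs end to end.
    # In CORA this would be a trained multilingual dense encoder shared
    # across languages, not a hash-seeded random vector.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    # One dense index over passages in ALL languages: no translation step
    # and no per-language retriever, matching the abstract's description.
    q = embed(question)
    scored = sorted(
        passages,
        key=lambda p: -sum(a * b for a, b in zip(q, embed(p))),
    )
    return scored[:k]

def answer(question: str, passages: list[str]) -> str:
    # Stand-in for the multilingual autoregressive generator, which would
    # condition on the question plus retrieved cross-lingual evidence and
    # decode the answer directly in the target language.
    evidence = retrieve(question, passages)
    return f"[generated from {len(evidence)} retrieved passages]"
```

The key design point the sketch mirrors is that retrieval and generation share one multilingual model each, so questions in languages with no annotated data or in-language knowledge source can still be served from passages written in other languages.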