A knowledge base is, in knowledge engineering, a structured, easy-to-operate, easy-to-use, and comprehensively organized cluster of knowledge: a collection of interrelated knowledge fragments that are stored, organized, managed, and used in computer memory under one (or several) knowledge representation schemes, to serve problem solving in a particular domain (or domains). These fragments include domain-related theoretical knowledge and factual data, together with heuristic knowledge distilled from expert experience, such as the definitions, theorems, and algorithms of the domain, as well as common-sense knowledge.
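The definition above can be made concrete with a minimal sketch: a knowledge base stored as a set of (subject, relation, object) triples with a pattern-matching lookup. All entity and relation names here are illustrative; real systems add indexing and inference rules on top.

```python
class KnowledgeBase:
    """A toy knowledge base holding interrelated facts as triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subj, rel, obj):
        self.triples.add((subj, rel, obj))

    def query(self, subj=None, rel=None, obj=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (subj is None or t[0] == subj)
                and (rel is None or t[1] == rel)
                and (obj is None or t[2] == obj)]

kb = KnowledgeBase()
kb.add("Paris", "capital_of", "France")
kb.add("Berlin", "capital_of", "Germany")
print(kb.query(rel="capital_of", obj="France"))
# [('Paris', 'capital_of', 'France')]
```

The triple form mirrors how large public knowledge bases (e.g. Wikidata) model facts, which is also the representation assumed by the knowledge-aware pre-training work discussed below.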

VIP Content

XLM-K: Improving Cross-Lingual Language Model Pre-Training with Multilingual Knowledge

https://www.zhuanzhi.ai/paper/f50b1d5ba3d41d06328348865c1549ea

Paper abstract:

Cross-lingual pre-training aims to improve a model's ability to transfer across languages, so that a model trained in one language can be tested directly in others. The cross-lingual ability of previous models came mainly from monolingual and bilingual plain text. Our work is the first to learn cross-lingual ability from multilingual knowledge bases. We propose two new pre-training tasks, masked entity prediction and object entailment, which help the model achieve better cross-lingual alignment and memorize knowledge better. Evaluation on downstream tasks shows that our model significantly improves performance on knowledge-related tasks, and a knowledge probing task confirms that it better memorizes the knowledge base.
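The masked entity prediction task mentioned in the abstract can be sketched at the data level: an entity mention in the text is replaced by mask tokens, and the model is trained to predict the knowledge-base entity the mention links to. This is an illustrative reconstruction, not the authors' code; the token list, span format, and entity id are hypothetical.

```python
def mask_entity(tokens, span):
    """Replace the entity mention at tokens[span[0]:span[1]] with [MASK] tokens."""
    start, end = span
    return tokens[:start] + ["[MASK]"] * (end - start) + tokens[end:]

tokens = ["Paris", "is", "the", "capital", "of", "France"]
masked = mask_entity(tokens, (0, 1))  # mask the mention "Paris"
# Training target: the KB entity linked to the masked mention (e.g. a Wikidata id).
print(masked)
# ['[MASK]', 'is', 'the', 'capital', 'of', 'France']
```

Because the same KB entity is linked from mentions in many languages, predicting the entity (rather than the surface word) is what gives the task its cross-lingual alignment signal.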

Key contribution: leveraging structured multilingual knowledge bases to improve pre-trained models, letting the model strengthen its cross-lingual transfer ability by mastering knowledge.


Latest Content

KBQA is a task that requires answering questions using the structured semantic information in a knowledge base. Previous work in this area has been restricted by the lack of large semantic parsing datasets and by the exponential growth of the search space as the number of hops in relation paths increases. In this paper, we propose an efficient pipeline method equipped with a pre-trained language model. By adopting a beam search algorithm, the search space is no longer restricted to a subgraph of 3 hops. Besides, we propose a data generation strategy that enables our model to generalize well from few training samples. We evaluate our model on CCKS2019, an open-domain complex Chinese question answering task, and achieve an F1-score of 62.55% on the test dataset. In addition, to test the few-shot learning capability of our model, we randomly select 10% of the primary data to train it; the model still achieves an F1-score of 58.54%, which verifies its capability on the KBQA task and its advantage in few-shot learning.
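The beam search over relation paths described above can be sketched as follows: at each hop, only the top-scoring partial paths are kept, which bounds the search space even when paths exceed a fixed hop limit. The graph, the scoring function, and all names here are hypothetical stand-ins; in the paper the scorer is a pre-trained language model, not a static weight table.

```python
def beam_search_paths(graph, start, score, beam_size=2, max_hops=3):
    """Keep the beam_size highest-scoring relation paths at each hop.

    graph: dict mapping entity -> list of (relation, next_entity) edges.
    score: function mapping a relation to a numeric score.
    Returns a list of (relation_path, end_entity, cumulative_score) beams.
    """
    beams = [([], start, 0)]
    for _ in range(max_hops):
        candidates = []
        for path, node, s in beams:
            for rel, nxt in graph.get(node, []):
                candidates.append((path + [rel], nxt, s + score(rel)))
        if not candidates:  # no outgoing edges from any beam
            break
        candidates.sort(key=lambda c: c[2], reverse=True)
        beams = candidates[:beam_size]
    return beams

graph = {"Q1": [("r1", "Q2"), ("r2", "Q3")], "Q2": [("r3", "Q4")]}
weights = {"r1": 3, "r2": 1, "r3": 2}
best = beam_search_paths(graph, "Q1", weights.get, beam_size=1)
print(best)
# [(['r1', 'r3'], 'Q4', 5)]
```

With a beam width of k, the number of candidate paths per hop is bounded by k times the branching factor, instead of growing exponentially with the hop count.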

