This work describes our two approaches for the background linking task of the TREC 2020 News Track. The main objective of this task is to recommend a list of relevant articles that a reader should consult to understand the context of the query article and gain background information on it. Our first approach focuses on building an effective search query by combining weighted keywords extracted from the query document, and uses BM25 for retrieval. The second approach leverages SBERT (Reimers and Gurevych) to learn contextual representations of the query and perform semantic search over the corpus. We empirically show that employing a language model helps our approach capture both the context and the background of the query article. The proposed approaches are evaluated on the TREC 2018 Washington Post dataset, and our best model outperforms the TREC median as well as the highest-scoring model of 2018 in terms of the nDCG@5 metric. We further propose a diversity measure to evaluate how effectively the various approaches retrieve a diverse set of documents, which could motivate researchers to introduce diversity into their recommended lists. We have open-sourced our implementation on GitHub and plan to submit our runs for the background linking task in TREC 2020.
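The first approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the paper's implementation: the toy corpus, keyword weights, and BM25 parameters (k1, b) are placeholder assumptions, and the keyword extraction and weighting scheme in the actual system is more involved.

```python
import math
from collections import Counter

def bm25_scores(query_terms, corpus, k1=1.2, b=0.75):
    """Score each document against a weighted-keyword query with BM25.

    query_terms: dict mapping term -> weight (keywords extracted from
                 the query article; weights are illustrative here).
    corpus:      list of tokenized candidate documents.
    """
    n_docs = len(corpus)
    avgdl = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency of each term across the corpus.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term, weight in query_terms.items():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            )
            # The keyword weight scales the term's BM25 contribution.
            score += weight * idf * norm
        scores.append(score)
    return scores

# Toy corpus of candidate background articles (tokenized).
corpus = [
    "election results spark debate in congress".split(),
    "new species of frog discovered in rainforest".split(),
    "congress debates election reform bill".split(),
]
# Hypothetical weighted keywords extracted from the query article.
query = {"election": 2.0, "congress": 1.0}

scores = bm25_scores(query, corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
```

The second approach replaces this lexical scoring with cosine similarity between SBERT embeddings of the query article and the candidates, ranking by semantic rather than keyword overlap.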