全文本和 URL 搜索网页档案 (Full-Text and URL Search Over Web Archives) - 专知论文

会员服务 ·

0

INFORMS · URL · BASIC · Engineering · 值域 ·

2021 年 8 月 3 日

Full-Text and URL Search Over Web Archives

翻译：全文本和 URL 搜索网页档案

from arxiv, Preprint

Web archives are a historically valuable source of information. In some respects, web archives are the only record of the evolution of human society in the last two decades. They preserve a mix of personal and collective memories, the importance of which tends to grow as they age. However, the value of web archives depends on their users being able to search and access the information they require in efficient and effective ways. Without the possibility of exploring and exploiting the archived contents, web archives are useless. Web archive access functionalities range from basic browsing to advanced search and analytical services, accessed through user-friendly interfaces. Full-text and URL search have become the predominant and preferred forms of information discovery in web archives, fulfilling user needs and supporting search APIs that feed complex applications. Both full-text and URL search are based on the technology developed for modern web search engines, since the Web is the main resource targeted by both systems. However, while web search engines enable searching over the most recent web snapshot, web archives enable searching over multiple snapshots from the past. This means that web archives have to deal with a temporal dimension that is the cause of new challenges and opportunities, discussed throughout this chapter.

翻译：网络档案是历史上宝贵的信息来源。在某些方面,网络档案是人类社会在过去二十年中演变的唯一记录,保存个人和集体的记忆,其重要性随着年老而日益增长。然而,网络档案的价值取决于其用户能否以高效和有效的方式搜索和访问他们所需要的信息。如果无法探索和利用存档内容,网络档案是没有用处的。网络档案访问功能从基本的浏览到先进的搜索和分析服务,通过方便用户的界面访问。全文和URL搜索已成为网络档案中信息发现的主要和首选形式,满足用户的需求,支持提供复杂应用程序的搜索API。全文和URL搜索都基于为现代网络搜索引擎开发的技术,因为网络是两个系统的主要目标资源。虽然网络搜索引擎能够搜索最新的网络快照,但网络档案能够从以往的多处搜索。这意味着网络档案必须处理作为新挑战和机会的时空因素,在整个章节中加以讨论。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

专知会员服务

32+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

专知

12+阅读 · 2018年4月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Context-Aware Unsupervised Clustering for Person Search

Arxiv

0+阅读 · 2021年10月4日

Promoting Open Science Through Research Data Management

Arxiv

0+阅读 · 2021年10月2日

Library of Congress Subject Heading (LCSH) Browsing and Natural Language Searching

Arxiv

0+阅读 · 2021年9月30日

Accelerating COVID-19 research with graph mining and transformer-based learning

Arxiv

0+阅读 · 2021年9月29日

Future-Aware Diverse Trends Framework for Recommendation

Arxiv

3+阅读 · 2020年11月1日

Learning to Personalize for Web Search Sessions

Arxiv

7+阅读 · 2020年9月17日

Embedding-based Retrieval in Facebook Search

Arxiv

12+阅读 · 2020年6月20日

DARTS: Differentiable Architecture Search

Arxiv

3+阅读 · 2018年6月24日

Current Challenges and Visions in Music Recommender Systems Research

Arxiv

7+阅读 · 2018年3月21日

A Survey on Dialogue Systems: Recent Advances and New Frontiers

Arxiv

11+阅读 · 2018年1月11日

VIP会员

文章信息

相关主题

相关VIP内容

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

专知会员服务

32+阅读 · 2019年9月19日

热门VIP内容

开通专知VIP会员享更多权益服务

《代码、指挥与冲突：描绘军事人工智能的未来》报告

【斯坦福博士论文】面向地理空间数据的多模态与多尺度建模：时空生成式人工智能

美国启动“自有军事人工智能计划”：采用谷歌Gemini以推动全军人工智能应用

《创新与适应性作为军事成功的关键因素：来自俄乌战争的战略洞见》报告

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

专知

12+阅读 · 2018年4月26日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Context-Aware Unsupervised Clustering for Person Search

Arxiv

0+阅读 · 2021年10月4日

Promoting Open Science Through Research Data Management

Arxiv

0+阅读 · 2021年10月2日

Library of Congress Subject Heading (LCSH) Browsing and Natural Language Searching

Arxiv

0+阅读 · 2021年9月30日

Accelerating COVID-19 research with graph mining and transformer-based learning

Arxiv

0+阅读 · 2021年9月29日

Future-Aware Diverse Trends Framework for Recommendation

Arxiv

3+阅读 · 2020年11月1日

Learning to Personalize for Web Search Sessions

Arxiv

7+阅读 · 2020年9月17日

Embedding-based Retrieval in Facebook Search

Arxiv

12+阅读 · 2020年6月20日

DARTS: Differentiable Architecture Search

Arxiv

3+阅读 · 2018年6月24日

Current Challenges and Visions in Music Recommender Systems Research

Arxiv

7+阅读 · 2018年3月21日

A Survey on Dialogue Systems: Recent Advances and New Frontiers

Arxiv

11+阅读 · 2018年1月11日

微信扫码咨询专知VIP会员