Despite the importance and pervasiveness of Wikipedia as one of the largest platforms for open knowledge, surprisingly little is known about how people navigate its content when seeking information. To bridge this gap, we present the first systematic large-scale analysis of how readers browse Wikipedia. Using billions of page requests from Wikipedia's server logs, we measure how readers reach articles, how they transition between articles, and how these patterns combine into more complex navigation paths. We find that navigation behavior is characterized by highly diverse structures. Although most navigation paths are shallow, comprising a single pageload, there is much variety, and the depth and shape of paths vary systematically with topic, device type, and time of day. We show that Wikipedia navigation paths commonly mesh with external pages as part of a larger online ecosystem, and we describe how naturally occurring navigation paths are distinct from targeted navigation in lab-based settings. Our results further suggest that navigation is abandoned when readers reach low-quality pages. These findings help not only in identifying potential improvements to the reader experience on Wikipedia, but also in better understanding how people seek knowledge on the Web.