Despite the importance and pervasiveness of Wikipedia as one of the largest platforms for open knowledge, surprisingly little is known about how people navigate its content when seeking information. To bridge this gap, we present the first systematic large-scale analysis of how readers browse Wikipedia. Using billions of page requests from Wikipedia's server logs, we measure how readers reach articles, how they transition between articles, and how these patterns combine into more complex navigation paths. We find that navigation behavior is characterized by highly diverse structures. Although most navigation paths are shallow, comprising a single pageload, there is much variety, and the depth and shape of paths vary systematically with topic, device type, and time of day. We show that Wikipedia navigation paths commonly mesh with external pages as part of a larger online ecosystem, and we describe how naturally occurring navigation paths are distinct from targeted navigation in lab-based settings. Our results further suggest that navigation is abandoned when readers reach low-quality pages. Taken together, these insights contribute to a more systematic understanding of readers' information needs and allow for improving their experience on Wikipedia and the Web in general.