Sections are the building blocks of Wikipedia articles. They enhance readability and serve as a structured entry point for creating and expanding articles. Structuring a new or existing Wikipedia article with sections is a hard task for humans, especially for newcomers and less experienced editors, because it requires substantial knowledge of what a well-written article looks like for each possible topic. Motivated by this need, this paper defines the problem of section recommendation for Wikipedia articles and proposes several approaches for tackling it. Our systems help editors by recommending which sections to add to existing or newly created Wikipedia articles. The basic paradigm is to generate recommendations by sourcing sections from articles that are similar to the input article. We explore several ways of defining similarity for this purpose, based on topic modeling, collaborative filtering, and Wikipedia's category system. We use both automatic and human evaluation to assess the performance of our recommendation systems and conclude that the category-based approach works best, achieving precision and recall at 10 of about 80% in the crowdsourcing evaluation.
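To make the basic paradigm concrete, the following is a minimal sketch of a category-based recommender together with precision and recall at k. It is an illustration under stated assumptions, not the paper's actual system: the function names (`recommend_sections`, `precision_recall_at_k`), the dictionary layout of an article, the plain frequency-count ranking, and the toy corpus are all hypothetical.

```python
from collections import Counter


def recommend_sections(input_categories, articles, k=10, existing=()):
    """Rank sections by how many category-sharing articles contain them.

    Assumption: an article counts as "similar" if it shares at least one
    category with the input article; ranking is by raw frequency.
    """
    counts = Counter()
    for article in articles:
        if input_categories & article["categories"]:
            # Count each section title at most once per article.
            counts.update(set(article["sections"]))
    # Do not recommend sections the input article already has.
    for title in existing:
        counts.pop(title, None)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:k]]


def precision_recall_at_k(recommended, relevant, k=10):
    """Precision and recall of the top-k recommendations against a relevant set."""
    top = recommended[:k]
    hits = sum(1 for title in top if title in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus (invented for illustration only).
articles = [
    {"categories": {"Swiss lakes"}, "sections": ["Geography", "History", "Ecology"]},
    {"categories": {"Swiss lakes", "Glacial lakes"}, "sections": ["Geography", "Ecology"]},
    {"categories": {"Rock bands"}, "sections": ["History", "Discography"]},
]

top2 = recommend_sections({"Swiss lakes"}, articles, k=2)
precision, recall = precision_recall_at_k(top2, {"Geography", "History"}, k=2)
```

The same skeleton would accommodate the other similarity definitions mentioned in the abstract by swapping the category-overlap test for a topic-model or collaborative-filtering similarity; how the paper actually scores and ranks candidates is not specified here.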