Readme files in GitHub repositories serve as a preliminary source of information and thus help developers understand projects for reuse or extension. Readme files contain different types of contextual and structural content, which we refer to as categories of content and features in the content, respectively, and these could determine how well a project is understood. Consequently, the structural and contextual aspects of the content could impact project popularity. Studying the correlation between readme content and project popularity could help authors focus, when designing readme files, on the aspects that could improve popularity. However, existing studies explore the categories of content and types of features in readme files but do not explore their usefulness towards project popularity. Hence, we present an empirical study to understand the correlation between readme file content and project popularity. We perform the study on 1950 readme files of public GitHub projects spanning ten programming languages, and observe that readme files in the majority of popular projects are well organised using lists and images and contain links to external sources. In addition, repositories whose readme files contain contribution guidelines and references were observed to be associated with higher popularity.