Representative sampling appears rare in empirical software engineering research. Not all studies need representative samples, but a general lack of representative sampling undermines a scientific field. This article therefore reports a systematic review of the state of sampling in recent, high-quality software engineering research. The key findings are: (1) random sampling is rare; (2) sophisticated sampling strategies are very rare; (3) sampling, representativeness and randomness often appear misunderstood. These findings suggest that \textit{software engineering research has a generalizability crisis}. To address these problems, this article synthesizes existing knowledge of sampling into a succinct primer and proposes extensive guidelines for improving the conduct, presentation and evaluation of sampling in software engineering research. It further recommends that, while researchers should strive for more representative samples, disparaging non-probability sampling is generally capricious and particularly misguided for predominantly qualitative research.