[Context] Study replication is essential for theory building and empirical validation. [Problem] Despite its empirical vocation, requirements engineering (RE) research has given limited attention to study replication, thereby threatening the ability to verify existing results and to use previous research as a baseline. [Solution] In this perspective paper, we, a group of experts in natural language processing (NLP) for RE, reflect on the challenges for study replication in NLP for RE. Concretely: (i) we report on hands-on experiences of replication, (ii) we review the state of the art and extract replication-relevant information, and (iii) we identify, through focus groups, challenges across two typical dimensions of replication: data annotation and tool reconstruction. NLP for RE is a research area that is suitable for study replication since it builds on automated tools, which can be shared, and on quantitative evaluation, which enables direct comparisons between results. [Results] Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity of the tasks involving NLP, the tasks' inherent hairiness, and, in turn, the heterogeneous reporting structure. To address these issues, we propose an ID card whose goal is to provide a structured summary of research papers, with an emphasis on replication-relevant information. [Contribution] This study contributes: (i) a set of reflections on replication in NLP for RE, (ii) a set of recommendations for researchers in the field to increase their awareness of the topic, and (iii) an ID card that is intended primarily to foster replication and can also be used in other contexts, e.g., for educational purposes. Practitioners will also benefit from the results, since replications increase confidence in research findings.
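Translated title: "Replication and Verifiability in Requirements Engineering: A Case Study of Natural Language Processing-Based Requirements Engineering"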