多语言软件项目的可追踪性支持 (Traceability Support for Multi-Lingual Software Projects)

Software traceability establishes associations between diverse software artifacts such as requirements, design, code, and test cases. Due to the non-trivial costs of manually creating and maintaining links, many researchers have proposed automated approaches based on information retrieval techniques. However, many globally distributed software projects produce software artifacts written in two or more languages. The use of intermingled languages reduces the efficacy of automated tracing solutions. In this paper, we first analyze and discuss patterns of intermingled language use across multiple projects, and then evaluate several different tracing algorithms including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), and various models that combine mono- and cross-lingual word embeddings with the Generative Vector Space Model (GVSM). Based on an analysis of 14 Chinese-English projects, our results show that best performance is achieved using mono-lingual word embeddings integrated into GVSM with machine translation as a preprocessing step.

翻译：由于人工创建和维护链接的非三重成本,许多研究人员提出了基于信息检索技术的自动化方法。然而,许多全球分布的软件项目生成了两种或多种语言的软件文物。使用混合语言降低了自动追踪解决方案的功效。在本文件中,我们首先分析和讨论多个项目混合使用语言的模式,然后评价若干不同的追踪算法,包括矢量空间模型、迟发语语语索引、利通狄里赫莱分配(LDA)以及将单词和跨语言词嵌入基因矢量空间模型(GVSM)的各种模型。根据对14个中英项目的分析,我们的结果显示,最佳的成绩是通过将单语词嵌入GVSM和机器翻译作为预处理步骤实现。

相关内容

Automator

关注 5

Automator是苹果公司为他们的Mac OS X系统开发的一款软件。 只要通过点击拖拽鼠标等操作就可以将一系列动作组合成一个工作流，从而帮助你自动的（可重复的）完成一些复杂的工作。Automator还能横跨很多不同种类的程序，包括：查找器、Safari网络浏览器、iCal、地址簿或者其他的一些程序。它还能和一些第三方的程序一起工作，如微软的Office、Adobe公司的Photoshop或者Pixelmator等。

【干货书】Python程序员编程，810页pdf，Python® for Programmers

专知会员服务

62+阅读 · 2020年8月6日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日