As more and more software services are published on the Internet, recommending suitable services to facilitate scientific workflow composition remains a significant challenge. This paper proposes a novel NLP-inspired approach to recommending services throughout the workflow development process, based on incrementally learning latent service representations from workflow provenance. The workflow composition process is formalized as a step-wise, context-aware service generation procedure, which is mapped to next-word prediction in a natural language sentence. Historical service dependencies are extracted from workflow provenance to build and enrich a knowledge graph. Each path in the knowledge graph reflects a scenario in a data analytics experiment and is analogous to a sentence in a conversation. All paths are thus formalized as composable service sequences and are mined from the established knowledge graph, using various patterns, to construct a corpus. Service embeddings are then learned by applying a deep learning model from the NLP field. Extensive experiments on a real-world dataset demonstrate the effectiveness and efficiency of the approach.
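To make the pipeline described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes provenance is available as lists of (upstream, downstream) service pairs, merges them into a directed graph, enumerates source-to-sink paths as "sentences", and recommends the next service from the resulting corpus. The service names and all function names are hypothetical, and a simple bigram counter stands in for the deep learning embedding model the paper uses.

```python
from collections import Counter, defaultdict

# Hypothetical provenance: each workflow is a list of
# (upstream service, downstream service) dependencies.
provenance = [
    [("fetch", "clean"), ("clean", "align"), ("align", "plot")],
    [("fetch", "clean"), ("clean", "cluster"), ("cluster", "plot")],
    [("fetch", "clean"), ("clean", "align"), ("align", "report")],
]


def build_graph(workflows):
    """Merge dependency edges from all workflows into one directed graph."""
    graph = defaultdict(set)
    for edges in workflows:
        for src, dst in edges:
            graph[src].add(dst)
    return graph


def extract_paths(graph):
    """Enumerate source-to-sink paths; each path plays the role of a sentence."""
    targets = {dst for dsts in graph.values() for dst in dsts}
    sources = sorted(s for s in graph if s not in targets)
    paths = []

    def walk(node, path):
        successors = sorted(graph.get(node, ()))
        if not successors:  # sink reached: the path is complete
            paths.append(path)
            return
        for nxt in successors:
            if nxt not in path:  # skip services already on the path (cycle guard)
                walk(nxt, path + [nxt])

    for source in sources:
        walk(source, [source])
    return paths


def train_bigram(corpus):
    """Count which service follows which; a stand-in for learned embeddings."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts


def recommend(counts, current, k=2):
    """Return up to k services most often seen right after `current`."""
    return [service for service, _ in counts.get(current, Counter()).most_common(k)]


if __name__ == "__main__":
    corpus = extract_paths(build_graph(provenance))
    model = train_bigram(corpus)
    print(recommend(model, "clean"))
```

In a fuller version, the bigram counter would be replaced by a sequence model trained on the same corpus, so that recommendations depend on the whole workflow context built so far rather than only on the most recent service.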