Several methods have been explored for automating parts of the Systematic Mapping (SM) and Systematic Review (SR) methodologies. The challenges typically revolve around gaps in the semantic understanding of text and the lack of the domain and background knowledge needed to bridge those gaps. In this paper we investigate ways of automating one part of the SM/SR process: extracting keywords and key-phrases from scientific documents using unsupervised methods, and then using them as the basis for constructing the corresponding Classification Scheme through semantic key-phrase clustering. Specifically, we explore (i) the effect of an ensemble scoring measure on key-phrase extraction, (ii) the use of semantic-network-based word embeddings to represent phrase semantics, and (iii) how clustering can be used to group related key-phrases. The evaluation is conducted on a dataset of publications in the domain of "Explainable AI", which we constructed from standard, publicly available digital libraries and their sets of indexing terms (keywords). The results show that the ensemble ranking score improves key-phrase extraction performance. Word embeddings based on the ConceptNet semantic network perform similarly to contextualized word embeddings, while being computationally more efficient. Finally, semantic key-phrase clustering at the term level groups similar terms together in a way that can be suitable for building a classification scheme.
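To make the idea of an ensemble ranking score concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that each unsupervised extractor returns a dictionary of candidate phrases with raw scores, rescales each extractor's scores to [0, 1], and averages them. The function names `min_max_normalize` and `ensemble_rank`, the min-max normalization, and the unweighted mean are all illustrative assumptions; the abstract does not specify how the individual scores are combined.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one extractor's raw scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; treat them as equally good.
        return {phrase: 1.0 for phrase in scores}
    return {phrase: (s - lo) / (hi - lo) for phrase, s in scores.items()}


def ensemble_rank(score_tables):
    """Average normalized scores across extractors (assumed combination rule).

    A phrase missing from an extractor's output contributes 0 for that
    extractor, so phrases proposed by several extractors are favored.
    """
    totals = defaultdict(float)
    for table in score_tables:
        for phrase, s in min_max_normalize(table).items():
            totals[phrase] += s
    n = len(score_tables)
    averaged = {phrase: total / n for phrase, total in totals.items()}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from two extractors on one document.
statistical_scores = {"explainable ai": 0.9, "neural network": 0.5, "paper": 0.1}
graph_scores = {"explainable ai": 3.0, "neural network": 4.0, "feature attribution": 1.0}

for phrase, score in ensemble_rank([statistical_scores, graph_scores]):
    print(f"{score:.3f}  {phrase}")
```

Normalizing before averaging keeps an extractor with a larger raw score range from dominating the combined ranking. Rank-based combinations, such as averaging reciprocal ranks, are a common alternative when the score distributions differ strongly.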