The Medical Subject Headings (MeSH) thesaurus is a controlled vocabulary widely used in biomedical knowledge systems, particularly for semantic indexing of scientific literature. As the MeSH hierarchy evolves through annual version updates, new descriptors are introduced that were not previously available. This paper explores the conceptual provenance of these new descriptors. In particular, we investigate whether such new descriptors were previously covered by older descriptors and what their current relation to those older descriptors is. To this end, we propose a framework that categorizes new descriptors based on their current relation to older descriptors. Based on the proposed classification scheme, we quantify, analyse and present the different types of new descriptors introduced in MeSH during the last fifteen years. The results show that only about 25% of new MeSH descriptors correspond to new emerging concepts, whereas the rest were previously covered, either implicitly or explicitly, by one or more existing descriptors. Most of these were covered by a single existing descriptor and usually end up as its descendants in the current hierarchy, gradually leading towards a more fine-grained MeSH vocabulary. These insights into the dynamics of the thesaurus are useful for the retrospective study of scientific articles annotated with MeSH, and could also inform future policy for updating the thesaurus.
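As a rough illustration of the kind of categorization described above, the following minimal sketch labels a new descriptor by how many older descriptors previously covered it and by whether it now sits below one of them in the hierarchy. It is not the framework proposed in the paper: the `Descriptor` class, the `classify` function, the label names and the identifiers and tree numbers in the usage example are all assumptions made for illustration. The only MeSH convention relied on is that a descendant's tree number extends its ancestor's tree number with further dot-separated segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """A MeSH descriptor reduced to the fields this sketch needs (hypothetical type)."""
    ui: str                    # identifier; placeholder values are used below
    tree_numbers: tuple = ()   # positions of the descriptor in the current hierarchy


def is_descendant(child, ancestor):
    """Return True if any tree number of `child` lies strictly below one of `ancestor`."""
    return any(
        c.startswith(a + ".")
        for c in child.tree_numbers
        for a in ancestor.tree_numbers
    )


def classify(new, previously_covering):
    """Return a (coverage, relation) pair; the label names are illustrative only.

    `previously_covering` is the list of older descriptors assumed to have
    covered the concept before the new descriptor was introduced.
    """
    if not previously_covering:
        # No older descriptor covered the concept: treat it as emerging.
        return ("emerging", None)
    coverage = "single" if len(previously_covering) == 1 else "multiple"
    below_some = any(is_descendant(new, old) for old in previously_covering)
    return (coverage, "descendant" if below_some else "other")


# Usage with made-up identifiers and tree numbers.
old = Descriptor("OLD1", ("C01.100",))
new = Descriptor("NEW1", ("C01.100.200",))
print(classify(new, [old]))
```

Comparing against the prefix plus a trailing dot, rather than the bare prefix, avoids treating a sibling whose tree number merely begins with the same characters as a descendant.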