Biological data and knowledge bases increasingly rely on Semantic Web technologies and knowledge graphs for data integration, retrieval, and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts two formulations of automated semantification, classification and clustering, which lie at opposite ends of the method-complexity spectrum. Modeling the problem according to its characteristics, we find that the clustering solution significantly outperforms a state-of-the-art deep neural network classification approach. Our contribution is twofold: 1) we show that a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) we show that automated semantification of biological assays achieves a high F1 score of nearly 83%, which, to our knowledge, is the first reported standardized evaluation of the task and offers a strong benchmark model.