Identifying pills from images captured under varying conditions and backgrounds is becoming increasingly important. Several deep learning-based approaches to pill recognition have been proposed in the literature. However, because many pills look highly similar, misrecognition is common, and pill recognition remains a challenge. To address this, we introduce PIKA, a novel approach that leverages external knowledge to improve pill recognition accuracy. Specifically, we address a practical scenario, which we call contextual pill recognition, that aims to identify the pills in a picture of a patient's pill intake. First, we propose a novel method for modeling the implicit associations between pills using an external data source, in this case prescriptions. Second, we present a walk-based graph embedding model that maps the graph space to a vector space and extracts condensed relational features of the pills. Third, we provide a final framework that combines image-based visual features and graph-based relational features to perform pill identification. Within this framework, the visual representation of each pill is mapped to the graph embedding space and then used to attend over the graph representation, producing a semantically rich context vector that aids the final classification. To our knowledge, this is the first study to use external prescription data to establish associations between medicines and to classify them using this auxiliary information. The PIKA architecture is lightweight and can be flexibly incorporated into any recognition backbone. Experimental results show that, by leveraging the external knowledge graph, PIKA improves the F1-score by 4.8% to 34.1% over the baselines.
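
The attention step described above can be illustrated with a minimal sketch. It is not the PIKA implementation: the abstract does not specify the attention form, so scaled dot-product attention is assumed here, and the names `graph_context`, `projection`, and `node_embeddings`, along with all toy values, are hypothetical. The sketch projects a pill's visual feature into the graph embedding space, attends over the node embeddings, and concatenates the resulting context vector with the visual feature for a downstream classifier.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_context(visual_feature, projection, node_embeddings):
    """Attend over graph node embeddings using a projected visual feature.

    Assumption: scaled dot-product attention; the source only says "attention".
    Returns the context vector and the attention weights.
    """
    # Map the visual feature into the graph embedding space.
    query = matvec(projection, visual_feature)
    scale = math.sqrt(len(query))
    # One similarity score per graph node.
    scores = [
        sum(q * k for q, k in zip(query, node)) / scale
        for node in node_embeddings
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the node embeddings.
    context = [
        sum(w * node[i] for w, node in zip(weights, node_embeddings))
        for i in range(len(query))
    ]
    return context, weights


# Toy inputs (hypothetical): a 3-d visual feature, a 2x3 projection,
# and three pill nodes embedded in a 2-d graph space.
visual_feature = [0.5, -1.0, 2.0]
projection = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
node_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

context, weights = graph_context(visual_feature, projection, node_embeddings)

# Fuse visual and relational information by concatenation (one possible choice).
fused = visual_feature + context
```

In this toy setup the third node receives the largest attention weight because its embedding aligns best with the projected visual feature. In a trained model the projection would be learned jointly with the classifier.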