Audiograms are a particular type of line chart representing an individual's hearing level at various frequencies. Audiologists use them to diagnose hearing loss and to select and tune appropriate hearing aids for their patients. Several projects, such as Autoaudio, aim to accelerate this process through machine learning, but existing models can at best only detect audiograms in images and classify them into general categories; they are unable to extract hearing level information from detected audiograms by interpreting the marks, axes, and lines. To address this issue, we propose a Multi-stage Audiogram Interpretation Network (MAIN) that reads hearing level data directly from photos of audiograms. We also established Open Audiogram, an open dataset of audiogram images annotated with marks and axes, on which we trained and evaluated the proposed model. Experiments show that our model is feasible and reliable.