This paper explores automated face and facial landmark detection of neonates, which is an important first step in many video-based neonatal health applications, such as vital sign estimation, pain assessment, sleep-wake classification, and jaundice detection. Utilising three publicly available datasets of neonates in the clinical environment, 366 images (258 subjects) and 89 images (66 subjects) were annotated for training and testing, respectively. Transfer learning was applied to two YOLO-based models, with input training images augmented with random horizontal flipping, photometric colour distortion, translation, and scaling during each training epoch. Additionally, the re-orientation of input images and the fusion of trained deep learning models were explored. Our proposed model based on YOLOv7Face outperformed existing methods with a mean average precision of 84.8% for face detection, and a normalised mean error of 0.072 for facial landmark detection. Overall, this work will assist in the development of fully automated neonatal health assessment algorithms.
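As a rough illustration of the landmark metric quoted above, the following is a minimal sketch of a normalised mean error (NME) computation: the mean Euclidean distance between predicted and annotated landmarks, divided by a normalising length. The abstract does not state which normalising length was used, so the inter-ocular distance here is an assumption, and the function name and landmark coordinates are hypothetical.

```python
import math


def normalised_mean_error(pred, truth, norm):
    """Mean point-to-point Euclidean error divided by a normalising length.

    pred, truth: equal-length sequences of (x, y) landmark coordinates.
    norm: normalising length in the same units (assumed choice, see below).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    if norm <= 0:
        raise ValueError("norm must be positive")
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return total / (len(pred) * norm)


# Hypothetical landmarks: left eye, right eye, nose tip (pixel coordinates).
truth = [(30.0, 40.0), (70.0, 40.0), (50.0, 60.0)]
pred = [(33.0, 44.0), (70.0, 40.0), (50.0, 60.0)]

# Assumption: normalise by the annotated inter-ocular distance.
norm = math.dist(truth[0], truth[1])
nme = normalised_mean_error(pred, truth, norm)
print(f"NME = {nme:.4f}")
```

Other normalisers, such as a face bounding-box dimension, are also common; the choice changes the scale of the reported value, so NME figures are only comparable when the same normaliser is used.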