Hand hygiene is a standard six-step hand-washing procedure proposed by the World Health Organization (WHO). However, there is currently no effective way to supervise whether medical staff perform hand hygiene correctly, which creates a potential risk of disease spread. Existing action assessment methods usually predict a single overall quality score for an entire video, yet the internal structure of the hand hygiene action is important for its assessment. We therefore propose a novel fine-grained learning framework that performs step segmentation and key action scoring jointly for accurate hand hygiene assessment. Existing temporal segmentation methods usually employ multi-stage convolutional networks to improve segmentation robustness, but they are prone to over-segmentation because they lack long-range dependencies. To address this issue, we design a multi-stage convolution-transformer network for step segmentation. Based on the observation that each hand-washing step involves several key actions that determine the hand-washing quality, we design a set of key action scorers to evaluate the quality of the key actions in each step. In addition, no unified dataset exists for hand hygiene assessment. Therefore, under the supervision of medical staff, we contribute a video dataset that contains 300 video sequences with fine-grained annotations. Extensive experiments on this dataset suggest that our method assesses hand hygiene videos accurately and achieves strong performance.
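
To make the over-segmentation problem concrete, the following minimal Python sketch collapses frame-wise step predictions into contiguous segments and counts how many segments exceed the number of distinct steps. It is an illustration only, not the segmentation network described above: the function names (`frames_to_segments`, `extra_segments`) and the toy label sequences are assumptions introduced here, and the count is a simple proxy rather than a metric taken from the paper.

```python
from itertools import groupby


def frames_to_segments(labels):
    """Collapse frame-wise step labels into (step, start, end) runs.

    `end` is exclusive. Hypothetical helper, not from the paper.
    """
    segments = []
    start = 0
    for step, run in groupby(labels):
        length = len(list(run))
        segments.append((step, start, start + length))
        start += length
    return segments


def extra_segments(labels):
    """Number of segments beyond one per distinct step.

    Assumes each step should ideally appear as a single contiguous run,
    so a value above 0 indicates fragmented (over-segmented) predictions.
    """
    return len(frames_to_segments(labels)) - len(set(labels))


# Toy frame-wise predictions for three steps (illustrative data only).
clean = [1, 1, 1, 2, 2, 2, 3, 3, 3]
noisy = [1, 1, 2, 1, 2, 2, 3, 2, 3]

print(frames_to_segments(clean))
print(extra_segments(clean), extra_segments(noisy))
```

In this toy example the clean sequence yields one segment per step, while the noisy one fragments into several short runs. Predictions made from local context alone tend to produce this kind of fragmentation, which is the failure mode that modelling long-range dependencies is intended to reduce.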