Vision-Based Activity Recognition in Children with Autism-Related Behaviors

Advances in machine learning and contactless sensors have enabled the understanding of complex human behaviors in healthcare settings. In particular, several deep learning systems have been introduced to enable comprehensive analysis of neuro-developmental conditions such as Autism Spectrum Disorder (ASD). This condition affects children from their early developmental stages onwards, and diagnosis relies entirely on observing the child's behavior and detecting behavioral cues. However, the diagnosis process is time-consuming because it requires long-term behavior observation, and it is further constrained by the scarce availability of specialists. We demonstrate the effect of a region-based computer vision system to help clinicians and parents analyze a child's behavior. For this purpose, we adopt and enhance a dataset for analyzing autism-related actions using videos of children captured in uncontrolled environments (e.g. videos collected with consumer-grade cameras, in varied environments). The data is pre-processed by detecting the target child in each video to reduce the impact of background noise. Motivated by the effectiveness of temporal convolutional models, we propose both light-weight and conventional models that extract action features from video frames and classify autism-related behaviors by analyzing the relationships between frames in a video. Through extensive evaluations of the feature extraction and learning strategies, we demonstrate that the best performance is achieved with an Inflated 3D Convnet combined with Multi-Stage Temporal Convolutional Networks, which reaches a Weighted F1-score of 0.83 for classification of the three autism-related actions and outperforms existing methods. We also propose a light-weight solution that employs the ESNet backbone within the same system; it achieves a competitive Weighted F1-score of 0.71 and enables potential deployment on embedded systems.
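
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a Weighted F1-score is commonly computed for a multi-class problem: the per-class F1 scores are averaged, with each class weighted by its share of the ground-truth samples. The function name `weighted_f1` and the class labels are hypothetical and chosen for illustration only; this is not the authors' evaluation code, and the example labels are not data from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration; labels may be any hashable values.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)  # number of ground-truth samples per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its support

    return score


# Made-up labels for three placeholder action classes.
truth = ["action_a", "action_a", "action_b", "action_c"]
preds = ["action_a", "action_b", "action_b", "action_c"]
print(round(weighted_f1(truth, preds), 4))
```

Weighting by support means that frequent classes contribute more to the final score, which matters when the action classes in a dataset are imbalanced.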
