Automatic detection of gaze targets in autistic children using artificial intelligence could have substantial impact, especially for children who lack sufficient access to professionals who could help improve their quality of life. This paper introduces a new, real-world AI application, gaze target detection in autistic children, which predicts a child's point of gaze from an image of an activity. The task is foundational for building automated systems that measure joint attention, a core challenge in Autism Spectrum Disorder (ASD). To facilitate study of this challenging application, we collected the first Autism Gaze Target (AGT) dataset. We further propose a novel Socially Aware Coarse-to-Fine (SACF) gaze detection framework that explicitly leverages the social context of a scene to overcome the class imbalance common in autism datasets, a consequence of autistic children's tendency to show reduced gaze to faces. The framework uses a two-pathway architecture with expert models specialized in social and non-social gaze, guided by a context-awareness gate module. Comprehensive experiments show that our framework achieves new state-of-the-art performance for gaze target detection in this population, significantly outperforming existing methods, especially on the critical minority class of face-directed gaze.
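To make the gated two-pathway idea concrete, the following is a minimal sketch and not the SACF implementation. The abstract does not say how the gate combines the two experts, so the soft blend below is an assumption, as are the names `blend_heatmaps`, `argmax_2d`, and `p_social`. The coarse-to-fine stage is not modeled.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def blend_heatmaps(social: Heatmap, non_social: Heatmap, p_social: float) -> Heatmap:
    """Soft-gate two expert heatmaps (an assumed combination rule).

    p_social stands in for a gate's estimate that the child is looking
    at a face; the output is p_social * social + (1 - p_social) * non_social.
    """
    if not 0.0 <= p_social <= 1.0:
        raise ValueError("p_social must lie in [0, 1]")
    return [
        [p_social * s + (1.0 - p_social) * n for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(social, non_social)
    ]


def argmax_2d(heatmap: Heatmap) -> Tuple[int, int]:
    """Return the (row, col) index of the largest heatmap value."""
    best = max(
        ((value, r, c) for r, row in enumerate(heatmap) for c, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Toy 2x2 heatmaps: the social expert peaks at (0, 0), the non-social one at (1, 1).
social_map = [[0.9, 0.1], [0.0, 0.0]]
non_social_map = [[0.0, 0.0], [0.1, 0.8]]

# A gate leaning towards "social" selects the social expert's peak.
gaze_point = argmax_2d(blend_heatmaps(social_map, non_social_map, p_social=0.8))
```

In this toy setup, a gate value near 1 lets the social expert dominate the predicted gaze point, and a value near 0 hands control to the non-social expert. A hard switch between experts would be an equally plausible reading of the abstract.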