Autism spectrum disorder (ASD) is a neurodevelopmental disorder that affects how children interact, communicate, and socialize with others. It presents as a broad spectrum of symptoms with varying effects and severity. While there is no permanent cure for ASD, early detection and proactive treatment can substantially improve the lives of many children. Current methods for diagnosing ASD are invasive, time-consuming, and tedious. They can also be subjective, because they rely on the perspectives of the several clinicians involved, including pediatricians, speech pathologists, psychologists, and psychiatrists. New technologies are rapidly emerging to detect the disorder accurately and in a timely manner, including machine learning models that use speech and computer vision models that use facial, retinal, and brain MRI images of patients. Our research focuses on computational linguistics and machine learning using speech data from TalkBank, the world's largest spoken language database. We used TalkBank data from children with ASD and children with Typical Development (TD) to develop machine learning models that predict ASD. We used more than 50 features from two TalkBank datasets and ran our experiments with five different classifiers. Logistic Regression and Random Forest models were found to be the most effective for these two datasets, with an accuracy of 0.75. These experiments confirm that, while significant opportunities exist for improving accuracy, machine learning models can reliably predict ASD status in children for effective diagnosis.
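The classification setup described above (numeric features per child, a binary ASD/TD label, accuracy on held-out data) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not list the features, the datasets, or the implementation, so the data below is synthetic, and every name (`make_synthetic`, `train_logistic_regression`, the five placeholder features, the 80/20 split) is a hypothetical choice made for illustration. Logistic regression is written from scratch with only the Python standard library so the sketch stays self-contained.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, X):
    # Label 1 when the predicted probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

def accuracy(y_true, y_pred):
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)

def make_synthetic(n=200, n_features=5, seed=0):
    # ASSUMPTION: synthetic stand-in for speech features; not TalkBank data.
    # Label 1 plays the role of "ASD", label 0 the role of "TD".
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(0.8 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y

X, y = make_synthetic()
split = int(0.8 * len(X))  # hypothetical 80/20 train/test split
w, b = train_logistic_regression(X[:split], y[:split])
test_accuracy = accuracy(y[split:], predict(w, b, X[split:]))
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Comparing several classifiers, as the study does, would amount to repeating the train and evaluate steps with a different model on the same split. In practice a library such as scikit-learn would normally replace the hand-written training loop.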