Implementing automated emotion recognition on mobile devices could provide an accessible diagnostic and therapeutic tool for those who struggle to recognize emotion, including children with developmental behavioral conditions such as autism. Although recent advances have been made in building more accurate emotion classifiers, existing models are too computationally expensive to be deployed on mobile devices. In this study, we optimized and profiled various machine learning models designed for inference on edge devices and were able to match previous state-of-the-art results for emotion recognition on children. Our best model, a MobileNet-V2 network pre-trained on ImageNet, achieved 65.11% balanced accuracy and a 64.19% F1-score on CAFE, while achieving a 45-millisecond inference latency on a Motorola Moto G6 phone. This balanced accuracy is only 1.79% lower than the current state of the art for CAFE, which used a model with 26.62x more parameters that was unable to run on the Moto G6, even when fully optimized. This work validates that, with specialized design and optimization techniques, machine learning models can become lightweight enough for deployment on mobile devices while still achieving high accuracy on difficult image classification tasks.
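As a rough illustration of the kind of pipeline described above (not the authors' exact code), the sketch below fine-tunes an ImageNet-pretrained MobileNet-V2 for emotion classification and converts it to TensorFlow Lite for on-device inference; the class count, input size, and file names are illustrative assumptions.

```python
import tensorflow as tf

# Illustrative sketch only: transfer learning from ImageNet weights to an
# emotion classifier, then conversion to a mobile-friendly TFLite model.
NUM_CLASSES = 7          # assumed number of emotion categories
IMG_SIZE = (224, 224)    # standard MobileNet-V2 input resolution

# ImageNet-pretrained backbone without the 1000-class classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,),
    include_top=False,
    weights="imagenet",
)

# Lightweight classification head on top of the backbone.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # dataset-specific

# Convert the trained model to TFLite with default optimizations so it can
# run within a phone's latency and memory budget.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open("emotion_mobilenetv2.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `.tflite` file can then be bundled with a mobile app and executed with the TensorFlow Lite interpreter, which is one common route to the kind of on-device latency reported above.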