Computer vision (CV) classifiers that distinguish and detect nonverbal human social behavior and mental states can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. CV classifiers for traditional, structured classification tasks can be developed with a standard supervised learning pipeline consisting of data labeling, preprocessing, and training a convolutional neural network, but several pain points arise when this process is applied to behavioral phenotyping. Here, we discuss the challenges and corresponding opportunities in this space, including handling heterogeneous data, avoiding biased models, labeling massive and repetitive data sets, working with ambiguous or compound class labels, managing privacy concerns, creating appropriate representations, and personalizing models. We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning. We highlight at least some of the machine learning advancements needed for imaging classifiers to detect human social cues successfully and reliably.