The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: in the Vocalisations and Stuttering Sub-Challenges, a classification of human non-verbal vocalisations and of speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE and BoAW features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition, we add end-to-end sequential modelling and a log-mel-128-BNN.
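As a point of reference for the "log-mel-128" features mentioned above, the following is a minimal sketch (not the official baseline code) of how 128-band log-mel spectrograms can be extracted with librosa; the file path, sampling rate, and default frame parameters are illustrative assumptions.

```python
# Minimal sketch: 128-band log-mel spectrogram extraction with librosa.
# This is illustrative only and does not reproduce the challenge baseline.
import librosa
import numpy as np


def log_mel_128(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Return a (frames x 128) log-mel spectrogram for one audio file."""
    y, sr = librosa.load(wav_path, sr=sr, mono=True)       # load and resample
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    log_mel = librosa.power_to_db(mel, ref=np.max)          # convert power to dB
    return log_mel.T                                         # time-major frames


if __name__ == "__main__":
    feats = log_mel_128("example.wav")  # hypothetical input file
    print(feats.shape)
```

Frame-wise feature matrices of this form are the typical input to sequential models or Bayesian neural networks such as the log-mel-128-BNN referred to in the abstract.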