Personality is crucial for understanding human internal and external states. The majority of existing personality computing approaches suffer from complex, dataset-specific pre-processing steps and model training tricks. In the absence of a standardized benchmark with consistent experimental settings, it is not only impossible to fairly compare the real performance of these personality computing models, but also difficult to reproduce them. In this paper, we present the first reproducible audio-visual benchmarking framework that provides a fair and consistent evaluation of eight existing personality computing models (i.e., audio, visual and audio-visual) and seven standard deep learning models on both self-reported and apparent personality recognition tasks. We conduct a comprehensive investigation of all the benchmarked models to demonstrate their capabilities in modelling personality traits on two publicly available datasets: the audio-visual apparent personality dataset (ChaLearn First Impression) and the self-reported personality dataset (UDIVA). The experimental results show that: (i) apparent personality traits, inferred from facial behaviours by most benchmarked deep learning models, are more reliably predicted than self-reported ones; (ii) visual models frequently achieve better personality recognition performance than audio models; and (iii) non-verbal behaviours contribute differently to the prediction of different personality traits. We make the code publicly available at https://github.com/liaorongfan/DeepPersonality .
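To make the audio-visual evaluation setting concrete, the following is a minimal, hypothetical PyTorch sketch of a late-fusion audio-visual trait regressor scored with the mean accuracy (1 − MAE) metric used for ChaLearn First Impression labels in [0, 1]. It is not the DeepPersonality repository's actual API; the encoder layers, feature dimensions, and function names are illustrative assumptions only.

```python
# Hypothetical sketch (not the DeepPersonality API): a late-fusion
# audio-visual regressor predicting the five OCEAN personality traits.
import torch
import torch.nn as nn


class LateFusionPersonalityModel(nn.Module):
    def __init__(self, visual_dim=512, audio_dim=128, num_traits=5):
        super().__init__()
        # Placeholder projections standing in for the benchmarked backbones.
        self.visual_head = nn.Sequential(nn.Linear(visual_dim, 256), nn.ReLU())
        self.audio_head = nn.Sequential(nn.Linear(audio_dim, 256), nn.ReLU())
        self.regressor = nn.Linear(512, num_traits)

    def forward(self, visual_feat, audio_feat):
        fused = torch.cat([self.visual_head(visual_feat),
                           self.audio_head(audio_feat)], dim=-1)
        # Traits are regressed in [0, 1], matching the dataset annotations.
        return torch.sigmoid(self.regressor(fused))


def mean_accuracy(pred, target):
    # 1 - mean absolute error, averaged over samples and traits.
    return 1.0 - (pred - target).abs().mean()


if __name__ == "__main__":
    model = LateFusionPersonalityModel()
    visual = torch.randn(4, 512)   # e.g. pooled frame-level CNN features
    audio = torch.randn(4, 128)    # e.g. pooled spectrogram features
    target = torch.rand(4, 5)      # OCEAN trait labels in [0, 1]
    with torch.no_grad():
        pred = model(visual, audio)
    print(f"mean accuracy: {mean_accuracy(pred, target):.3f}")
```

The same evaluation loop can score audio-only or visual-only variants by feeding a single modality into its corresponding head, which is the comparison setting the benchmark is concerned with.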