We introduce AVCAffe, the first Audio-Visual dataset with Cognitive load and Affect attributes. We record AVCAffe by simulating remote work scenarios over a video-conferencing platform, where subjects collaborate to complete a number of cognitively engaging tasks. AVCAffe is the largest originally collected (not collected from the Internet) affective dataset in the English language. We recruit 106 participants from 18 different countries of origin, aged 18 to 57, with a balanced male-female ratio. AVCAffe comprises a total of 108 hours of video, equivalent to more than 58,000 clips, along with task-based self-reported ground truth labels for arousal, valence, and cognitive load attributes such as mental demand, temporal demand, effort, and a few others. We believe AVCAffe will be a challenging benchmark for the deep learning research community, given the inherent difficulty of classifying affect and, in particular, cognitive load. Moreover, our dataset fills a timely gap by facilitating both the creation of learning systems for better self-management of remote work meetings and the further study of hypotheses regarding the impact of remote work on cognitive load and affective states.