In this work, we present a database of multimodal communication features extracted from debate speeches in the 2019 North American Universities Debate Championships (NAUDC). Feature sets were extracted from the visual (facial expression, gaze, and head pose), audio (PRAAT), and textual (word sentiment and linguistic category) modalities of raw video recordings of competitive collegiate debaters (N = 717 six-minute recordings from 140 unique debaters). Each speech has an associated competition debate score (range: 67-96) assigned by expert judges, as well as competitor demographic and per-round reflection surveys. We observe that the fully multimodal model performs best in comparison to models trained on various subsets of the modalities. We also find that the weights of some features (such as the expression of joy and the use of the word "we") change in direction between the aforementioned models. We use these results to highlight the value of a multimodal dataset for studying competitive collegiate debate.
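As an illustration of the modality-ablation comparison described above, the sketch below fits simple regression models on different compositions of modality features and scores them against the judge-assigned speech scores. This is a minimal, hypothetical example, not the authors' pipeline: the file name, column prefixes (`au_`, `gaze_`, `pose_`, `praat_`, `sentiment_`, `liwc_`, `judge_score`), and the choice of ridge regression are all illustrative assumptions.

```python
# Hypothetical sketch of a modality-ablation comparison on the released features.
# Assumes a flat CSV export with feature columns grouped by prefix per modality.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

df = pd.read_csv("naudc_features.csv")  # illustrative path, not part of the dataset spec

# Illustrative groupings of feature columns by modality (prefixes are assumptions).
modalities = {
    "visual": [c for c in df.columns if c.startswith(("au_", "gaze_", "pose_"))],
    "audio":  [c for c in df.columns if c.startswith("praat_")],
    "text":   [c for c in df.columns if c.startswith(("sentiment_", "liwc_"))],
}
compositions = {
    "visual only":     modalities["visual"],
    "audio only":      modalities["audio"],
    "text only":       modalities["text"],
    "full multimodal": modalities["visual"] + modalities["audio"] + modalities["text"],
}

y = df["judge_score"]  # judge-assigned competition debate score (67-96)
for name, cols in compositions.items():
    model = Ridge(alpha=1.0)
    r2 = cross_val_score(model, df[cols], y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {r2:.3f}")
```

Under this kind of setup, the abstract's finding would correspond to the "full multimodal" composition achieving the highest cross-validated score among the compositions compared.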