Key communicative competencies include the ability to maintain fluency in monologic speech and the ability to produce sophisticated language that argues a position convincingly. In this paper, we aim to predict TED talk-style affective ratings in a crowdsourced dataset of argumentative speech comprising 7 hours of speech from 110 individuals. The speech samples were elicited through task prompts relating to three debating topics. The samples received a total of 2211 ratings from 737 human raters across 14 affective categories. We present an effective approach to the classification task of predicting these categories by fine-tuning a model pre-trained on a large dataset of TED talks. We use a combination of fluency features derived from a state-of-the-art automatic speech recognition system and a large set of human-interpretable linguistic features obtained from an automatic text analysis system. Classification accuracy exceeded 60% for all 14 rating categories, peaking at 72% for the rating category 'informative'. In a secondary experiment, we used SP-LIME to determine the relative importance of features from different groups.