Vocal entrainment is a social adaptation mechanism in human interaction, and understanding it can offer useful insights into an individual's cognitive-behavioral characteristics. We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. We use conformers (a combination of convolutional networks and transformers) to capture both short-term and long-term conversational context and to model entrainment patterns in interactions across different domains. Specifically, we use cross-subject attention layers to learn intra- as well as inter-personal signals from dyadic conversations. We first validate the proposed method with classification experiments that distinguish real (consistent) from fake (inconsistent/shuffled) conversations. Experimental results on interactions involving individuals with Autism Spectrum Disorder also show evidence of a statistically significant association between the introduced entrainment measure and clinical scores relevant to symptoms, including across gender and age groups.
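To make two ideas from the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each speaker turn is already a `(frames, dim)` feature matrix, and it shows (a) a single scaled dot-product attention step in which one speaker's frames attend to the other speaker's frames, and (b) one possible way to build real (matched) and fake (shuffled) turn pairs for a validation classifier. The function names, the cyclic-shift shuffling, and the absence of learned projections and conformer blocks are all illustrative assumptions.

```python
import numpy as np


def cross_subject_attention(query_feats, context_feats):
    """One speaker's frames attend to the other speaker's frames.

    query_feats:   (T_a, d) float features for speaker A (hypothetical input)
    context_feats: (T_b, d) float features for speaker B (hypothetical input)
    Returns the (T_a, d) attended features and the (T_a, T_b) attention weights.
    No learned projections are used; this is plain scaled dot-product attention.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over B's frames
    return weights @ context_feats, weights


def make_real_and_fake_pairs(speaker_a, speaker_b, rng):
    """Label matched turn pairs 1 (real) and mismatched pairs 0 (fake).

    Assumption: fakes are built by a cyclic shift with a random nonzero offset,
    so every fake pair is guaranteed to be mismatched. Requires at least 2 turns.
    """
    n = len(speaker_a)
    offset = int(rng.integers(1, n))  # drawn from 1 .. n-1
    real = [(speaker_a[i], speaker_b[i], 1) for i in range(n)]
    fake = [(speaker_a[i], speaker_b[(i + offset) % n], 0) for i in range(n)]
    return real + fake


# Toy usage with random features standing in for acoustic features.
rng = np.random.default_rng(0)
a_turns = [rng.standard_normal((5, 8)) for _ in range(4)]
b_turns = [rng.standard_normal((7, 8)) for _ in range(4)]

pairs = make_real_and_fake_pairs(a_turns, b_turns, rng)
attended, weights = cross_subject_attention(a_turns[0], b_turns[0])
```

In this sketch, a classifier trained on `pairs` would see inter-personal information only through the attended features, which is the intuition behind validating an entrainment measure on real versus shuffled conversations.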