Personality computing has become an emerging topic in computer vision due to its wide range of applications. However, most work on the topic has focused on analyzing individuals over short periods of time, even in interaction scenarios. To address these limitations, we present the Dyadformer, a novel multi-modal multi-subject Transformer architecture that models individual and interpersonal features in dyadic interactions using variable time windows, thus allowing the capture of long-term interdependencies. Our proposed cross-subject layer allows the network to explicitly model interactions among subjects through attentional operations. This proof-of-concept approach shows how multi-modality and joint modeling of both interactants over longer periods of time help to predict individual attributes. With the Dyadformer, we improve state-of-the-art self-reported personality inference results on individual subjects on the UDIVA v0.5 dataset.
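To make the cross-subject idea concrete, below is a minimal sketch of a cross-subject attention block in PyTorch. It is not the authors' implementation: the class name `CrossSubjectLayer`, the feature dimension, the number of heads, and the residual/normalization layout are all illustrative assumptions. It only shows the core mechanism implied by the abstract, namely that each subject's token sequence attends to the interaction partner's tokens via multi-head attention.

```python
# Minimal sketch (hypothetical, not the Dyadformer code) of a cross-subject
# attention block: each subject's tokens attend to the other subject's tokens.
import torch
import torch.nn as nn


class CrossSubjectLayer(nn.Module):
    """Illustrative cross-subject block: queries come from one subject,
    keys/values from the interaction partner (assumed design)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn_a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)

    def forward(self, tokens_a: torch.Tensor, tokens_b: torch.Tensor):
        # Subject A attends to subject B's tokens, and vice versa;
        # residual connections preserve each subject's own representation.
        a_from_b, _ = self.attn_a(tokens_a, tokens_b, tokens_b)
        b_from_a, _ = self.attn_b(tokens_b, tokens_a, tokens_a)
        return self.norm_a(tokens_a + a_from_b), self.norm_b(tokens_b + b_from_a)


# Toy usage: two subjects, 8 temporal tokens each, feature dimension 256.
layer = CrossSubjectLayer(dim=256, heads=4)
a = torch.randn(1, 8, 256)
b = torch.randn(1, 8, 256)
out_a, out_b = layer(a, b)
print(out_a.shape, out_b.shape)  # torch.Size([1, 8, 256]) for each subject
```

In a full multi-modal, multi-subject model, such a block would sit on top of per-subject, per-modality encoders and be applied over variable-length time windows; the sketch above only isolates the attentional coupling between the two interactants.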