Conversations among online users sometimes derail, i.e., break down into personal attacks. Such derailment harms the healthy growth of online communities. The ability to predict whether an ongoing conversation is likely to derail could provide valuable real-time insight to interlocutors and moderators. Prior approaches predict conversation derailment retrospectively and therefore cannot forestall it proactively. Some works attempt dynamic prediction as the conversation develops but fail to incorporate multisource information, such as conversation structure and distance to derailment. We propose a hierarchical transformer-based framework that combines utterance-level and conversation-level information to capture fine-grained contextual semantics. We also propose a domain-adaptive pretraining objective to integrate conversational structure information and a multitask learning scheme to leverage the distance from each utterance to derailment. An evaluation of our framework on two conversation derailment datasets yields an improvement in F1 score for the prediction of derailment. These results demonstrate the effectiveness of incorporating multisource information.
Title: Conversation Modeling to Predict Derailment
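The abstract does not spell out how the distance from each utterance to derailment enters the multitask objective, so the following is only a minimal sketch under stated assumptions. It assumes the distance is counted in utterances, that the auxiliary task is a mean-squared-error regression, and that the two losses are combined with a fixed weight. The names `distance_labels`, `multitask_loss`, and `distance_weight` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def distance_labels(num_context_utterances: int) -> List[int]:
    """Assumed labeling: the i-th context utterance (0-based) is
    `num_context_utterances - i` utterances away from the derailing one."""
    if num_context_utterances < 0:
        raise ValueError("num_context_utterances must be non-negative")
    return [num_context_utterances - i for i in range(num_context_utterances)]


def binary_cross_entropy(prob: float, label: int, eps: float = 1e-12) -> float:
    """Cross-entropy for the main task: will the conversation derail?"""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def multitask_loss(
    derail_prob: float,
    derail_label: int,
    pred_distances: Sequence[float],
    true_distances: Sequence[float],
    distance_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Main derailment loss plus a weighted auxiliary distance loss.

    The distance term is skipped for conversations that never derail,
    because no distance is defined for them (an assumption of this sketch).
    """
    main = binary_cross_entropy(derail_prob, derail_label)
    if derail_label == 0 or len(true_distances) == 0:
        return main
    if len(pred_distances) != len(true_distances):
        raise ValueError("pred_distances and true_distances differ in length")
    squared_errors = [(p - t) ** 2 for p, t in zip(pred_distances, true_distances)]
    auxiliary = sum(squared_errors) / len(squared_errors)
    return main + distance_weight * auxiliary


if __name__ == "__main__":
    labels = distance_labels(4)
    print(labels)
    print(multitask_loss(0.8, 1, [3.0, 3.0, 2.0, 1.0], labels))
```

In a real system the predicted probability and distances would come from the model's two output heads; here they are plain numbers so that the loss arithmetic can be checked by hand.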