Online discussions and opinion sharing on social media have been booming worldwide in recent years. The re-entry prediction task has thus been proposed to help people keep track of the discussions they wish to continue. Nevertheless, existing works focus only on exploiting chatting history and context information, and ignore potentially useful learning signals underlying conversation data, such as conversation thread patterns and the repeated engagement of target users, which help better characterize the behavior of target users in conversations. In this paper, we propose three interesting and well-founded auxiliary tasks, namely Spread Pattern, Repeated Target user, and Turn Authorship, as self-supervised signals for re-entry prediction. These auxiliary tasks are trained jointly with the main task in a multi-task manner. Experimental results on two datasets newly collected from Twitter and Reddit show that our method outperforms the previous state of the art with fewer parameters and faster convergence. Extensive experiments and analyses demonstrate the effectiveness of the proposed models and also point out some key ideas for designing self-supervised tasks.
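As a rough sketch of what "trained jointly in a multi-task manner" typically entails (the weights and loss symbols below are illustrative assumptions, not details stated in the abstract), the overall objective can be written as a weighted sum of the main re-entry loss and the three auxiliary losses:

$$
\mathcal{L} \;=\; \mathcal{L}_{\text{re-entry}} \;+\; \lambda_{1}\,\mathcal{L}_{\text{SP}} \;+\; \lambda_{2}\,\mathcal{L}_{\text{RT}} \;+\; \lambda_{3}\,\mathcal{L}_{\text{TA}},
$$

where $\mathcal{L}_{\text{SP}}$, $\mathcal{L}_{\text{RT}}$, and $\mathcal{L}_{\text{TA}}$ denote the losses of the Spread Pattern, Repeated Target user, and Turn Authorship tasks, and $\lambda_{1}, \lambda_{2}, \lambda_{3}$ are hypothetical task-weighting hyperparameters.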