People often use online media (e.g., Facebook, Reddit) as a platform to express psychological distress and seek support. State-of-the-art NLP techniques show strong potential for automatically detecting mental health issues from text. Research suggests that mental health issues are reflected in the emotions (e.g., sadness) conveyed by a person's choice of language. We therefore developed a novel emotion-annotated mental health corpus (EmoMent), consisting of 2802 Facebook posts (14845 sentences) collected from two South Asian countries, Sri Lanka and India. Three clinical psychology postgraduates annotated these posts with eight categories, including 'mental illness' (e.g., depression) and emotions (e.g., 'sadness', 'anger'). The EmoMent corpus achieved 'very good' inter-annotator agreement of 98.3% (i.e., the percentage of cases in which two or more annotators agreed) and a Fleiss' Kappa of 0.82. Our RoBERTa-based models achieved an F1 score of 0.76 for the first task (i.e., predicting a mental health condition from a post) and a macro-averaged F1 score of 0.77 for the second task (i.e., predicting the extent to which relevant posts are associated with the categories defined in our taxonomy).
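For readers unfamiliar with the two agreement figures reported above, the following is a minimal sketch of how such figures are commonly computed. It is not the authors' code. It assumes a simplified setting in which every item receives exactly one label from each of three annotators, which may differ from the actual EmoMent annotation scheme. The function names and the toy `ratings` data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_agreement(ratings):
    """Fraction of items on which at least two annotators chose the same label."""
    agreed = sum(1 for labels in ratings if max(Counter(labels).values()) >= 2)
    return agreed / len(ratings)


def fleiss_kappa(ratings):
    """Fleiss' kappa for items that were each rated by the same number of annotators."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(labels) != n_raters for labels in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(labels) for labels in ratings]

    # Observed agreement: mean proportion of agreeing annotator pairs per item.
    p_bar = sum(
        (sum(c * c for c in item.values()) - n_raters) / (n_raters * (n_raters - 1))
        for item in counts
    ) / n_items

    # Chance agreement: sum of squared overall category proportions.
    totals = Counter()
    for item in counts:
        totals.update(item)
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())

    if p_e == 1.0:
        # Only one category was ever used, so agreement is trivially perfect.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: four items, three annotators each.
ratings = [
    ["sadness", "sadness", "sadness"],
    ["anger", "anger", "sadness"],
    ["mental illness", "mental illness", "mental illness"],
    ["anger", "sadness", "mental illness"],
]

print(f"two-or-more agreement: {majority_agreement(ratings):.3f}")
print(f"Fleiss' kappa: {fleiss_kappa(ratings):.3f}")
```

The two measures answer different questions: the percentage counts how often a majority label exists at all, whereas Fleiss' Kappa corrects the observed pairwise agreement for the agreement expected by chance, which is why the two values differ.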