Mental health disorders affect a significant portion of the global population, with diagnoses primarily conducted through Mental State Examinations (MSEs). MSEs serve as structured assessments of behavioral and cognitive functioning across various domains, aiding mental health professionals in diagnosis and treatment monitoring. However, in developing countries, access to mental health support is limited, placing overwhelming demands on mental health professionals. Resident doctors often conduct initial patient assessments and create summaries for senior doctors, but their limited availability results in extended patient wait times. This study addresses the challenge of generating concise summaries from MSEs through the evaluation of various language models. Given the scarcity of relevant mental health conversation datasets, we developed a 12-item descriptive MSE questionnaire and collected responses from 405 participants, resulting in 9,720 utterances covering diverse mental health aspects. We then assessed the performance of five well-known pre-trained summarization models, both with and without fine-tuning, on summarizing MSEs. Our comprehensive evaluation, leveraging metrics such as ROUGE, SummaC, and human judgment, demonstrates that language models can generate coherent automated MSE summaries for doctors. With this paper, we publicly release our collected conversational dataset and trained models for the mental health research community.